1. INTRODUCTION

One of the most successful applications of computational logic is Formal Verification, which
aims at proving (or disproving) certain properties of the behaviours of a reactive system.
In recent years, also thanks to the impressive improvements of SAT solvers, a wide
variety of verification methods based on SAT solving have been proposed. These methods
proved effective for discrete state systems, most notably hardware components. The approach
is made practical by the fact that SAT solvers, in addition to efficiently proving the
satisfiability of huge propositional formulas, provide several functionalities, such as model
generation, proof production, extraction of unsatisfiable cores, and generation of Craig
interpolants (interpolation). In particular, since the seminal paper of McMillan [McMillan
2003], interpolation has been recognized to be a substantial tool for verification in the case
of Boolean systems [Cabodi et al. 2006; Li and Somenzi 2006; Marques-Silva 2007].

One of the main limitations of SAT-based approaches is their expressive power. Many
systems of practical interest containing integer or real valued variables, such as software
and timed and hybrid systems, cannot be represented directly within propositional logic.
This has prompted research in the analysis of fragments of first order logic: given a
formula referring to variables, the problem is to find a satisfying assignment in a theory of
interest (e.g. linear arithmetic). This field, referred to as Satisfiability Modulo Theory
(SMT), has resulted in substantial theoretical results, and in very effective decision
procedures, known as SMT solvers. State of the art SMT solvers complement the Boolean
SAT algorithms with specialized decision procedures for conjunctions of literals in some
given theory (theory solvers). In addition to checking satisfiability, SMT solvers are able to
generate models, produce proofs, and extract unsatisfiable cores. This has made it possible to
lift many SAT-based verification algorithms to SMT-based verification, and has opened
up the way to abstraction-refinement with SMT.

Quite surprisingly, however, the research on interpolation for SMT has not kept pace
with SMT solving. In fact, the current approaches to producing interpolants for fragments
of first order theories [McMillan 2005; Yorsh and Musuvathi 2005; Rybalchenko
and Sofronie-Stokkermans 2007; Kroening and Weissenbacher 2007; Kapur et al. 2006;
Jain et al. 2008] all suffer from a number of problems. Some of the approaches are severely
limited in terms of their expressiveness. For instance, the tool described in [Rybalchenko
and Sofronie-Stokkermans 2007] can only deal with conjunctions of literals, whilst the
recent work described in [Kroening and Weissenbacher 2007] cannot deal with many
useful theories. Furthermore, very few tools are available [Rybalchenko and
Sofronie-Stokkermans 2007; McMillan 2005], and these tools do not seem to scale particularly well.
This appears to be due less to naive implementations than to the underlying algorithms,
which substantially deviate from or ignore choices common in state-of-the-art SMT. For
instance, in the domain of linear arithmetic over the rationals ($\mathcal{LA}(\mathbb{Q})$), strict inequalities
are encoded in [McMillan 2005] as the conjunction of a weak inequality and a disequality;
although sound, this choice destroys the structure of the constraints, forces reasoning in
the combination of theories $\mathcal{LA}(\mathbb{Q}) \cup \mathcal{EUF}$, requires additional splitting, and ultimately
results in a larger search space. Similarly, the fragment of Difference Logic ($\mathcal{DL}(\mathbb{Q})$) is
dealt with by means of a general-purpose algorithm for full $\mathcal{LA}(\mathbb{Q})$, rather than one of
the well-known and much faster specialized algorithms. An even more fundamental
example is the fact that state-of-the-art SMT reasoners use dedicated algorithms for Linear
Arithmetic [Dutertre and de Moura 2006].

In this paper, we tackle the problem of generating interpolants for SMT problems, fully
leveraging the algorithms used in a state of the art SMT solver. In particular, our main
contributions are:

(1) An interpolation algorithm for $\mathcal{LA}(\mathbb{Q})$ that exploits a variant of the algorithm
presented in [Dutertre and de Moura 2006], and that is capable of handling the full
$\mathcal{LA}(\mathbb{Q})$ - including strict inequalities and disequalities - without the need of theory
combination;

(2) An algorithm for computing interpolants in $\mathcal{DL}$ - both over the rationals and over the
integers - that builds on top of the efficient graph-based decision algorithms given
in [Cotton and Maler 2006; Nieuwenhuis and Oliveras 2005], that ensures that the
generated interpolants are still in the $\mathcal{DL}$ fragment of linear arithmetic, and that allows
for computing stronger interpolants than the existing algorithms for the full linear
arithmetic;

(3) An algorithm for computing interpolants in $\mathcal{UTVPI}$ - both over the rationals and over
the integers - that builds on an encoding of $\mathcal{DL}$. The algorithm ensures that the
generated interpolants are still in the $\mathcal{UTVPI}$ fragment of linear arithmetic, and allows
for computing stronger interpolants than the existing algorithms for the full linear
arithmetic;

(4) An algorithm for computing interpolants in a combination $T_1 \cup T_2$ of theories based on
the Delayed Theory Combination (DTC) method [Bozzano et al. 2006; Bruttomesso
et al. 2008a] (as an alternative to the traditional Nelson-Oppen method), which does
not require ad-hoc interpolant combination methods, but exploits the propositional
interpolation algorithm for performing the combination of theories;

(5) An efficient implementation of all the proposed techniques within the MathSAT 4 
SMT solver [Bruttomesso et al. 2008b], and an extensive experimental evaluation on 
a wide range of benchmarks. 

This comprehensive approach advances the state of the art in two main directions: on 
one side, we show how to extend efficient SMT solving techniques to SMT interpolation, 
for a wide class of important theories, without paying a substantial price in performance; on 
the other side, we present an interpolating SMT solver that is able to produce interpolants 
for a much wider class of problems than its competitors, and, on problems that can be 
dealt with by other tools, shows dramatic improvements in performance, often by orders 
of magnitude. 

Content. The paper is structured as follows. In §2 we present some background on
interpolation in SMT. In §3, §4 and §5 we show how to efficiently interpolate $\mathcal{LA}(\mathbb{Q})$, $\mathcal{DL}$
and $\mathcal{UTVPI}$ respectively. In §6 we discuss interpolation for combined theories. The
proposed techniques are experimentally evaluated in §7. In §8 we draw some conclusions,
and outline directions for future work. The discussion of related work is distributed in the
technical sections (§3-§6).

Note to reviewers. Some of the material contained in this paper, in a less detailed form, 
has been published in two conference papers [Cimatti et al. 2008; 2009].

2. BACKGROUND AND STATE-OF-THE-ART
2.1 Satisfiability Modulo Theory - SMT 

Our setting is standard first order logic. A 0-ary function symbol is called a constant. A
term is a first-order term built out of function symbols and variables. We write $t_1 \equiv t_2$
when the two terms $t_1$ and $t_2$ are syntactically identical. If $t_1, \ldots, t_n$ are terms and $p$ is
a predicate symbol, then $p(t_1, \ldots, t_n)$ is an atom. A literal is either an atom or its
negation. A formula $\varphi$ is built in the usual way out of the universal and existential quantifiers,
Boolean connectives, and atoms. We call a formula quantifier-free if it does not contain
quantifiers, and ground if it does not contain free variables. A clause is a disjunction of
literals. A formula is said to be in conjunctive normal form (CNF) if it is a conjunction
of clauses. For every non-CNF T-formula $\varphi$, an equisatisfiable CNF formula $\psi$ can be
generated in polynomial time [Tseitin 1968].

We also assume the usual first-order notions of interpretation, satisfiability, validity, log- 
ical consequence, and theory, as given, e.g., in [Enderton 1972]. A first-order theory, T, 
is a set of first-order sentences. In this paper, we consider only theories with equality. 
A structure A is a model of a theory T if A satisfies every sentence in T. A formula is 
satisfiable in T (or T-satisfiable) if it is satisfiable in a model of T. 

We call Satisfiability Modulo (the) Theory T, SMT(T), the problem of deciding the
satisfiability of quantifier-free formulas¹ with respect to a background theory T. We denote
formulas with $\varphi, \psi, A, B, C, I$, T-variables with $x, y, z$, Boolean variables with $p, q$, and
numeric constants with $a, b, c, l, u$. Given a theory T, we write $\varphi \models_T \psi$ (or simply $\varphi \models \psi$)
to denote that the formula $\psi$ is a logical consequence of $\varphi$ in the theory T. With $\varphi \preceq \psi$
we denote that all uninterpreted (in T) symbols of $\varphi$ appear in $\psi$. If $C$ is a clause, $C \downarrow B$
is the clause obtained by removing all the literals whose atoms do not occur in $B$, and
$C \setminus B$ that obtained by removing all the literals whose atoms do occur in $B$. With a little
abuse of notation, we might sometimes denote conjunctions of literals $l_1 \wedge \ldots \wedge l_n$ as sets
$\{l_1, \ldots, l_n\}$ and vice versa. If $\eta \equiv \{l_1, \ldots, l_n\}$, we might write $\neg\eta$ to mean $\neg l_1 \vee \ldots \vee \neg l_n$.
A theory T is stably-infinite iff every quantifier-free T-satisfiable formula is satisfiable in
an infinite model of T. A theory T is convex iff, for every collection $l_1, \ldots, l_k, e_1, \ldots, e_n$
of literals in T s.t. $e_1, \ldots, e_n$ are in the form $(x = y)$, $x, y$ being variables, we have that
$\{l_1, \ldots, l_k\} \models_T \bigvee_{i=1}^{n} e_i$ if and only if $\{l_1, \ldots, l_k\} \models_T e_i$ for some $1 \le i \le n$.

Given a decidable first-order theory T, we call a theory solver for T, T-solver, any tool
able to decide the satisfiability in T of sets/conjunctions of ground atomic formulas and
their negations (theory literals or T-literals) in the language of T. If $S \equiv \{l_1, \ldots, l_n\}$
is a set of literals in T, we call (T)-conflict set any subset $\eta$ of $S$ which is inconsistent in
T.² We call $\neg\eta$ a T-lemma. (Notice that $\neg\eta$ is a T-valid clause.)

Definition 2.1 Resolution proof. Given a set of clauses $S \equiv \{C_1, \ldots, C_n\}$ and a clause
$C$, we call a resolution proof of the deduction $\bigwedge_i C_i \models_T C$ a DAG $\mathcal{P}$ such that:

(1) $C$ is the root of $\mathcal{P}$;

(2) the leaves of $\mathcal{P}$ are either elements of $S$ or T-lemmas;

(3) each non-leaf node $C'$ has two premises $C_{p_1}$ and $C_{p_2}$ such that $C_{p_1} \equiv p \vee \varphi_1$,
$C_{p_2} \equiv \neg p \vee \varphi_2$, and $C' \equiv \varphi_1 \vee \varphi_2$. The atom $p$ is called the pivot of $C_{p_1}$ and $C_{p_2}$.

If $C$ is the empty clause (denoted with $\bot$), then $\mathcal{P}$ is a resolution proof of (T-)unsatisfiability
for $\bigwedge_i C_i$.

¹ The general definition of SMT deals also with quantified formulas. Nevertheless, in this paper we restrict our
interest to quantifier-free formulas.

² In the next sections, as we are in an SMT(T) context, we often omit specifying "in the theory T" when speaking
of consistency, validity, etc.

    SatValue Lazy_SMT_Solver (T-formula $\varphi$) {
        $\varphi'$ = convert_to_cnf($\varphi$)
        $\varphi^p$ = $\mathcal{T}2\mathcal{P}$($\varphi'$)
        while (DPLL($\varphi^p$, $\mu^p$) == sat) {
            $\langle \rho, \eta \rangle$ = T-solver($\mathcal{P}2\mathcal{T}$($\mu^p$))
            if ($\rho$ == sat) then return sat
            $\varphi^p$ = $\varphi^p \wedge \mathcal{T}2\mathcal{P}$($\neg\eta$)
        }
        return unsat
    }

Fig. 1. A simplified schema for lazy SMT(T) procedures.

We consider the SMT(T) problem for some background theory T. 

Definition 2.2 Craig Interpolant. Given an ordered pair $(A, B)$ of formulas such that
$A \wedge B \models_T \bot$, a Craig interpolant (simply "interpolant" hereafter) is a formula $I$ s.t.:

(i) $A \models_T I$,

(ii) $I \wedge B \models_T \bot$,

(iii) $I \preceq A$ and $I \preceq B$.
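
The three conditions can be checked mechanically in the purely propositional case. The following is a minimal sketch, assuming formulas are given as Python predicates over an assignment and that condition (iii) is read as "every variable of $I$ occurs in both $A$ and $B$"; the function name `is_interpolant` and the example formulas are illustrative and do not come from the paper.

```python
from itertools import product

def is_interpolant(a, b, i, vars_a, vars_b, vars_i):
    """Brute-force check of conditions (i)-(iii) for propositional formulas.

    a, b, i are predicates over an assignment dict {variable: bool};
    vars_a, vars_b, vars_i are the variable sets of A, B and I.
    """
    # (iii) I may only mention symbols shared by A and B.
    if not set(vars_i) <= set(vars_a) & set(vars_b):
        return False
    all_vars = sorted(set(vars_a) | set(vars_b))
    for bits in product([False, True], repeat=len(all_vars)):
        m = dict(zip(all_vars, bits))
        if a(m) and not i(m):      # (i) fails: a model of A falsifies I
            return False
        if i(m) and b(m):          # (ii) fails: I and B are jointly satisfiable
            return False
    return True

# A = p and (not p or q), B = not q; the candidate interpolant is q.
A = lambda m: m["p"] and (not m["p"] or m["q"])
B = lambda m: not m["q"]
ok = is_interpolant(A, B, lambda m: m["q"], {"p", "q"}, {"q"}, {"q"})
```

The enumeration is exponential in the number of variables, so this is only meant to make the definition concrete on small inputs.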

2.2 Algorithms for SMT 

A standard technique for solving the SMT(T) problem is to integrate a DPLL-based SAT
solver and a T-solver in a "lazy" manner. The idea underlying every lazy SMT(T) procedure
is that (a complete set of) the truth assignments for the propositional abstraction of
$\varphi$ are enumerated and checked for satisfiability in T; the procedure either returns sat if one
T-satisfiable truth assignment is found, or it returns unsat otherwise.

Figure 1 presents a simplified schema of a lazy SMT(T) procedure, called the off-line
schema. The bijective function $\mathcal{T}2\mathcal{P}$ ("Theory-to-Boolean"), called Boolean abstraction,
maps Boolean atoms into themselves and non-Boolean T-atoms into fresh Boolean atoms
(so that two atom instances in $\varphi$ are mapped into the same Boolean atom iff they are
syntactically identical) and extends to T-formulas and sets of T-formulas in the obvious
way, i.e., $\mathcal{T}2\mathcal{P}(\neg\varphi_1) \equiv \neg\mathcal{T}2\mathcal{P}(\varphi_1)$, $\mathcal{T}2\mathcal{P}(\varphi_1 \bowtie \varphi_2) \equiv \mathcal{T}2\mathcal{P}(\varphi_1) \bowtie \mathcal{T}2\mathcal{P}(\varphi_2)$ for each
Boolean connective $\bowtie$, and $\mathcal{T}2\mathcal{P}(\{\varphi_i\}_i) \equiv \{\mathcal{T}2\mathcal{P}(\varphi_i)\}_i$. The function $\mathcal{P}2\mathcal{T}$
("propositional-to-theory"), called refinement, is the inverse of $\mathcal{T}2\mathcal{P}$. The propositional abstraction $\varphi^p$ of
the input formula $\varphi$ is given as input to a SAT solver based on the DPLL algorithm [Davis
et al. 1962; Zhang and Malik 2002], which either decides that $\varphi^p$ is unsatisfiable, and hence
$\varphi$ is T-unsatisfiable, or returns a satisfying assignment $\mu^p$; in the latter case, $\mathcal{P}2\mathcal{T}(\mu^p)$ is
given as input to T-solver. If $\mathcal{P}2\mathcal{T}(\mu^p)$ is found T-consistent, then $\varphi$ is T-consistent. If
not, T-solver returns the conflict set $\eta$ which caused the T-inconsistency of $\mathcal{P}2\mathcal{T}(\mu^p)$;
the abstraction of the T-lemma $\neg\eta$, $\mathcal{T}2\mathcal{P}(\neg\eta)$, is then added as a clause to $\varphi^p$. Then the
DPLL solver is restarted from scratch on the resulting formula.
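
The off-line loop can be sketched in a few lines. The following is an illustration only, under strong simplifying assumptions: the SAT engine is a brute-force enumerator standing in for DPLL, and the theory is a toy one in which every atom is an upper bound `(var, c)` read as "var <= c" (so its negation is "var > c"). All names (`t_solver`, `sat_enumerate`, `lazy_smt`, `atom_of`) are hypothetical.

```python
from itertools import product

def t_solver(literals):
    """Toy T-solver. A literal is (atom, polarity) with atom = (var, c) meaning var <= c.
    Returns ("sat", None) or ("unsat", conflict_set)."""
    for a, pa in literals:
        for b, pb in literals:
            # "x <= ca" together with "x > cb" is inconsistent iff ca <= cb.
            if pa and not pb and a[0] == b[0] and a[1] <= b[1]:
                return "unsat", [(a, pa), (b, pb)]
    return "sat", None

def sat_enumerate(clauses, n_vars):
    """Brute-force stand-in for DPLL: a satisfying assignment, or None."""
    for bits in product([False, True], repeat=n_vars):
        mu = {v + 1: bits[v] for v in range(n_vars)}
        if all(any(mu[abs(l)] == (l > 0) for l in c) for c in clauses):
            return mu
    return None

def lazy_smt(clauses, atom_of):
    """clauses: CNF over integer literals (the Boolean abstraction).
    atom_of: Boolean variable -> theory atom (plays the role of P2T)."""
    clauses = [list(c) for c in clauses]
    n_vars = max(abs(l) for c in clauses for l in c)
    var_of = {atom: v for v, atom in atom_of.items()}   # plays the role of T2P
    while True:
        mu = sat_enumerate(clauses, n_vars)
        if mu is None:
            return "unsat"
        status, eta = t_solver([(atom_of[v], val) for v, val in mu.items() if v in atom_of])
        if status == "sat":
            return "sat"
        # Add the abstraction of the T-lemma (negation of the conflict set) as a clause.
        clauses.append([-var_of[a] if pol else var_of[a] for a, pol in eta])

atom_of = {1: ("x", 0), 2: ("x", 5)}           # 1: x <= 0, 2: x <= 5
r_unsat = lazy_smt([[1], [-2]], atom_of)       # x <= 0 and x > 5
r_sat = lazy_smt([[1, 2], [-1]], atom_of)      # (x <= 0 or x <= 5) and x > 0
```

Each added lemma excludes the assignment that produced it, so the loop terminates after finitely many iterations.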

Practical implementations follow a more elaborate schema, called the on-line schema
(see [Sebastiani 2007]). As before, $\varphi^p$ is given as input to a modified version of DPLL, and
when a satisfying assignment $\mu^p$ is found, the refinement $\mu$ of $\mu^p$ is fed to the T-solver;
if $\mu$ is found T-consistent, then $\varphi$ is T-consistent; otherwise, T-solver returns the
conflict set $\eta$ which caused the T-inconsistency of $\mathcal{P}2\mathcal{T}(\mu^p)$. Then the clause $\neg\eta^p$ is added in
conjunction to $\varphi^p$, either temporarily or permanently (T-learning), and the algorithm
backtracks up to the highest point in the search where one of the literals in $\neg\eta^p$ is unassigned
(T-backjumping), and therefore its value is (propositionally) implied by the others in $\neg\eta^p$.
Another important improvement is early pruning (EP): before every literal selection,
intermediate assignments are checked for T-satisfiability and, if not T-satisfiable, they are
pruned (since no refinement can be T-satisfiable). Finally, theory propagation can be used
to reduce the search space by allowing the T-solvers to explicitly return truth values for
unassigned literals, which can be unit-propagated by the SAT solver. The interested reader
is pointed to, e.g., [Sebastiani 2007] for details and further references.

ACM Transactions on Computational Logic, Vol. V, No. N, Month 20YY.

Efficient Generation of Craig Interpolants in Satisfiability Modulo Theories • 7

With a small modification of the embedded DPLL engine, a lazy SMT solver can also 
be used to generate a resolution proof of unsatisfiability (see e.g. [van Gelder 2007]). 

2.3 Interpolation in SMT 

The use of interpolation in formal verification has been introduced by McMillan in [McMillan
2003] for purely-propositional formulas, and it was subsequently extended to handle
SMT($\mathcal{EUF} \cup \mathcal{LA}(\mathbb{Q})$) formulas in [McMillan 2005], $\mathcal{EUF}$ being the theory of equality and
uninterpreted functions. The technique is based on earlier work by Pudlák [Pudlák 1997],
where two interpolant-generation algorithms are described: one for computing interpolants
for propositional formulas from resolution proofs of unsatisfiability, and one for generating
interpolants for conjunctions of (weak) linear inequalities in $\mathcal{LA}(\mathbb{Q})$. An interpolant for a
pair $(A, B)$ of CNF formulas is constructed from a resolution proof of unsatisfiability of
$A \wedge B$, generated as outlined in §2.1. The algorithm works by computing a formula $I_C$ for
each clause $C$ in the resolution refutation, such that the formula $I_\bot$ associated to the empty
root clause is the computed interpolant. The algorithm can be described as follows:

Algorithm 1: Interpolant generation for SMT(T)

(1) Generate a resolution proof of unsatisfiability $\mathcal{P}$ for $A \wedge B$.

(2) For every T-lemma $\neg\eta$ occurring in $\mathcal{P}$, generate an interpolant $I_{\neg\eta}$ for $(\eta \setminus B, \eta \downarrow B)$.

(3) For every input clause $C$ in $\mathcal{P}$, set $I_C \equiv C \downarrow B$ if $C \in A$, and $I_C \equiv \top$ if $C \in B$.

(4) For every inner node $C$ of $\mathcal{P}$ obtained by resolution from $C_1 \equiv p \vee \varphi_1$ and
$C_2 \equiv \neg p \vee \varphi_2$, set $I_C \equiv I_{C_1} \vee I_{C_2}$ if $p$ does not occur in $B$, and $I_C \equiv I_{C_1} \wedge I_{C_2}$ otherwise.

(5) Output $I_\bot$ as an interpolant for $(A, B)$.
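
Steps (3)-(5) amount to a bottom-up traversal of the proof. The following minimal sketch assumes a proof given as nested dictionaries, literals encoded as non-zero integers, and interpolants for T-lemmas supplied externally (Step 2); the data layout and the names `restrict` and `interpolant` are illustrative assumptions, not the representation used in any particular solver.

```python
TRUE, FALSE = ("true",), ("false",)

def mk_or(a, b):
    """Disjunction with the obvious simplifications."""
    if TRUE in (a, b):
        return TRUE
    if a == FALSE:
        return b
    if b == FALSE:
        return a
    return a if a == b else ("or", a, b)

def mk_and(a, b):
    """Conjunction with the obvious simplifications."""
    if FALSE in (a, b):
        return FALSE
    if a == TRUE:
        return b
    if b == TRUE:
        return a
    return a if a == b else ("and", a, b)

def restrict(clause, atoms_b):
    """C restricted to B: keep only the literals whose atoms occur in B."""
    out = FALSE
    for lit in clause:
        if abs(lit) in atoms_b:
            out = mk_or(out, ("lit", lit))
    return out

def interpolant(node, atoms_b):
    """Partial interpolant of a proof node, following Steps (2)-(4)."""
    kind = node["kind"]
    if kind == "A":                       # input clause from A
        return restrict(node["clause"], atoms_b)
    if kind == "B":                       # input clause from B
        return TRUE
    if kind == "lemma":                   # T-lemma: interpolant given by a theory procedure
        return node["itp"]
    i1 = interpolant(node["left"], atoms_b)
    i2 = interpolant(node["right"], atoms_b)
    return mk_and(i1, i2) if node["pivot"] in atoms_b else mk_or(i1, i2)

# A = (p) and (not p or q), B = (not q), with p = 1 and q = 2.
leaf_p = {"kind": "A", "clause": [1]}
leaf_pq = {"kind": "A", "clause": [-1, 2]}
leaf_nq = {"kind": "B", "clause": [-2]}
inner = {"kind": "res", "pivot": 1, "left": leaf_p, "right": leaf_pq}   # derives (q)
root = {"kind": "res", "pivot": 2, "left": inner, "right": leaf_nq}     # derives the empty clause
itp = interpolant(root, atoms_b={2})
```

On this small propositional instance the traversal yields the single literal `q`, which is implied by A, inconsistent with B, and built only from the shared atom.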



Example 2.1. Consider the following two formulas in $\mathcal{LA}(\mathbb{Q})$:

    $A \equiv (p \vee (0 \le x_1 - 3x_2 + 1)) \wedge (0 \le x_1 + x_2) \wedge (\neg q \vee \neg(0 \le x_1 + x_2))$

    $B \equiv (\neg(0 \le x_3 - 2x_1 - 3) \vee (0 \le 1 - 2x_3)) \wedge (\neg p \vee q) \wedge (p \vee (0 \le x_3 - 2x_1 - 3))$

Figure 2(a) shows a resolution proof of unsatisfiability for $A \wedge B$, in which the clauses
from $A$ have been underlined. The proof contains the following $\mathcal{LA}(\mathbb{Q})$-lemma (displayed
in boldface):

    $\neg(0 \le x_1 - 3x_2 + 1) \vee \neg(0 \le x_1 + x_2) \vee \neg(0 \le x_3 - 2x_1 - 3) \vee \neg(0 \le 1 - 2x_3)$.

Figure 2(b) shows, for each clause $\Theta_i$ in the proof, the formula $I_{\Theta_i}$ generated by Algorithm
1. For the $\mathcal{LA}(\mathbb{Q})$-lemma, it is easy to see that $(0 \le 4x_1 + 1)$ is an interpolant for
$((0 \le x_1 - 3x_2 + 1) \wedge (0 \le x_1 + x_2), (0 \le x_3 - 2x_1 - 3) \wedge (0 \le 1 - 2x_3))$ as required
by Step 2 of the algorithm. (We will show how to obtain this interpolant in Example 2.2.)
Therefore, $I_\bot \equiv (p \vee (0 \le 4x_1 + 1)) \wedge \neg q$ is an interpolant for $(A, B)$.

[Figure 2: proof trees (a) and (b) not reproduced.]

Fig. 2. Resolution proof of unsatisfiability (a) and interpolant (b) for the pair (A, B) of formulas of Example 2.1.
In the tree on the left, T-lemmas are displayed in boldface, and clauses from A are underlined.

Algorithm 1 can be applied also when A and B are not in CNF. In this case, it suffices to 
pre-convert them into CNF by using disjoint sets of auxiliary Boolean atoms in the usual 
way [McMillan 2005]. 

Notice that Step 2 of the algorithm is the only part which depends on the theory T,
so that the problem of interpolant generation in SMT(T) reduces to that of finding
interpolants for T-lemmas. To this end, in [McMillan 2005] McMillan gives a set of rules for
constructing interpolants for T-lemmas in the theory of $\mathcal{EUF}$, that of weak linear
inequalities $(0 \le t)$ in $\mathcal{LA}(\mathbb{Q})$, and their combination. Linear equalities $(0 = t)$ can be reduced
to conjunctions $(0 \le t) \wedge (0 \le -t)$ of inequalities. Thanks to the combination of theories,
also strict linear inequalities $(0 < t)$ can be handled in $\mathcal{EUF} \cup \mathcal{LA}(\mathbb{Q})$ by replacing them
with the conjunction $(0 \le t) \wedge (0 \ne t)$,³ but this solution can be very inefficient.

³ The details are not given in [McMillan 2005]. One possible way of doing this is to rewrite $(0 \ne t)$ as
$(y = t) \wedge (z = 0) \wedge (z \ne y)$, $z$ and $y$ being fresh variables.

The combination $\mathcal{EUF} \cup \mathcal{LA}(\mathbb{Q})$ can also be used to compute interpolants for other
theories, such as those of lists, arrays, sets and multisets [Kapur et al. 2006].

In [McMillan 2005], interpolants in the combined theory $\mathcal{EUF} \cup \mathcal{LA}(\mathbb{Q})$ are obtained
by means of ad-hoc combination rules. The work in [Yorsh and Musuvathi 2005], instead,
presents a method for generating interpolants for $T_1 \cup T_2$ using the interpolant-generation
procedures of $T_1$ and $T_2$ as black-boxes, using the Nelson-Oppen approach [Nelson and
Oppen 1979].

    LeqEq:  $\dfrac{0 = t}{0 \le t}$          Comb:  $\dfrac{0 \le t_1 \quad 0 \le t_2}{0 \le c_1 t_1 + c_2 t_2}$   $c_1, c_2 > 0$

Fig. 3. $\mathcal{LA}(\mathbb{Q})$-proof rules for a conjunction $\Gamma$ of equalities and weak inequalities.

The method of [Rybalchenko and Sofronie-Stokkermans 2007] also allows computing
interpolants in $\mathcal{EUF} \cup \mathcal{LA}(\mathbb{Q})$. Its peculiarity is that it is not based on unsatisfiability
proofs. Instead, it generates interpolants in $\mathcal{LA}(\mathbb{Q})$ by solving a system of constraints using
an off-the-shelf Linear Programming (LP) solver. The method allows both weak and strict
inequalities. Extension to uninterpreted functions is achieved by means of reduction to
$\mathcal{LA}(\mathbb{Q})$ using a hierarchical calculus [Sofronie-Stokkermans 2006]. The algorithm works
only with conjunctions of atoms, although in principle it could be integrated in Algorithm
1 to generate interpolants for T-lemmas in $\mathcal{LA}(\mathbb{Q})$. As an alternative, the authors show in
[Rybalchenko and Sofronie-Stokkermans 2007] how to generate interpolants for formulas
that are in Disjunctive Normal Form (DNF).

A different approach is explored in [Kroening and Weissenbacher 2007]. There,
the authors use the eager SMT approach to encode the original SMT problem into an
equisatisfiable propositional problem, for which a propositional proof of unsatisfiability
is generated. This proof is later "lifted" to the original theory, and used to generate an
interpolant in a way similar to Algorithm 1. At the moment, however, the approach is
limited to the theory of equality only (without uninterpreted functions).

All the above techniques construct one interpolant for $(A, B)$. In general, however,
interpolants are not unique. In particular, some of them can be better than others, depending 
on the particular application domain. In [Jhala and McMillan 2005], it is shown how to 
manipulate proofs in order to obtain stronger interpolants. In [Jhala and McMillan 2006; 
2007], instead, a technique to restrict the language used in interpolants is presented and 
shown to be useful in preventing divergence of techniques based on predicate abstraction. 

One of the most important applications of interpolation in Formal Verification is
abstraction refinement [Henzinger et al. 2004; McMillan 2006]. In such a setting, every input
problem $\varphi$ has the form $\varphi \equiv \varphi_1 \wedge \ldots \wedge \varphi_n$, and the interpolating solver is asked to compute
several interpolants $I_1, \ldots, I_{n-1}$ corresponding to different partitions of $\varphi$ into $A_i$ and $B_i$,
such that

    $\forall i, \quad A_i \equiv \varphi_1 \wedge \ldots \wedge \varphi_i, \text{ and } B_i \equiv \varphi_{i+1} \wedge \ldots \wedge \varphi_n. \qquad (1)$

Moreover, $I_1, \ldots, I_{n-1}$ should be related by the following:

    $I_i \wedge \varphi_{i+1} \models I_{i+1}. \qquad (2)$

A sufficient condition for (2) to hold is that all the $I_i$'s are computed from the same proof
of unsatisfiability $\Pi$ for $\varphi$ [Henzinger et al. 2004].

2.3.1 Interpolants for conjunctions of $\mathcal{LA}(\mathbb{Q})$-literals. We recall the algorithm of [McMillan
2005] for computing interpolants from $\mathcal{LA}(\mathbb{Q})$-proofs of unsatisfiability, for conjunctions
of equalities and weak inequalities in $\mathcal{LA}(\mathbb{Q})$.

An $\mathcal{LA}(\mathbb{Q})$-proof rule $R$ for a conjunction $\Gamma$ of equalities and weak inequalities is either
an element of $\Gamma$, or it has the form $\dfrac{P}{\varphi}$, where $\varphi$ is an equality or a weak inequality and $P$
is a sequence of proof rules, called the premises of $R$. An $\mathcal{LA}(\mathbb{Q})$-proof of unsatisfiability
for a conjunction of equalities and weak inequalities $\Gamma$ is simply a rule in which $\varphi \equiv 0 \le c$
and where $c$ is a negative numerical constant.⁴

Similarly to [McMillan 2005], we use the proof rules of Figure 3: LeqEq for deriving
inequalities from equalities, and COMB for performing linear combinations.⁵

Given an $\mathcal{LA}(\mathbb{Q})$-proof of unsatisfiability $P$ for a conjunction $\Gamma$ of equalities and weak
inequalities partitioned into $(A, B)$, an interpolant $I$ can be computed simply by replacing
every atom $0 \le t$ (resp. $0 = t$) occurring in $B$ with $0 \le 0$ (resp. $0 = 0$) in each leaf
sub-rule of $P$, and propagating the results; the interpolant is then the single weak inequality
$0 \le t$ at the root of $P$ [McMillan 2005].

Example 2.2. Consider the following sets of $\mathcal{LA}(\mathbb{Q})$ atoms:

    $A \equiv \{(0 \le x_1 - 3x_2 + 1), (0 \le x_1 + x_2)\}$
    $B \equiv \{(0 \le x_3 - 2x_1 - 3), (0 \le 1 - 2x_3)\}$.

An $\mathcal{LA}(\mathbb{Q})$-proof of unsatisfiability $P$ for $A \wedge B$ is the following:

    $\dfrac{\dfrac{1 * (0 \le x_1 - 3x_2 + 1) \quad 3 * (0 \le x_1 + x_2)}{1 * (0 \le 4x_1 + 1)} \qquad \dfrac{2 * (0 \le x_3 - 2x_1 - 3) \quad 1 * (0 \le 1 - 2x_3)}{1 * (0 \le -4x_1 - 5)}}{(0 \le -4)}$

By replacing inequalities in $B$ with $(0 \le 0)$, we obtain the proof $P'$:

    $\dfrac{\dfrac{1 * (0 \le x_1 - 3x_2 + 1) \quad 3 * (0 \le x_1 + x_2)}{1 * (0 \le 4x_1 + 1)} \qquad \dfrac{2 * (0 \le 0) \quad 1 * (0 \le 0)}{1 * (0 \le 0)}}{(0 \le 4x_1 + 1)}$

Thus, the interpolant obtained is $(0 \le 4x_1 + 1)$.
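
The arithmetic of this example can be replayed directly. In the sketch below, which is only an illustration, a term $t$ of an atom $(0 \le t)$ is a dictionary from variable names to coefficients, with the key `"1"` holding the constant; this encoding and the helper name `combine` are assumptions made for the example. Replacing the atoms of $B$ with $(0 \le 0)$ is the same as summing only the weighted atoms of $A$.

```python
from fractions import Fraction

def combine(weighted):
    """Return the term sum(c * t) for (c, t) pairs, dropping zero coefficients.
    A term maps variable names to coefficients; the key "1" is the constant."""
    total = {}
    for c, t in weighted:
        for var, coef in t.items():
            total[var] = total.get(var, Fraction(0)) + Fraction(c) * coef
    return {var: coef for var, coef in total.items() if coef != 0}

# Atoms (0 <= t) of Example 2.2, each paired with its coefficient in the proof.
A = [(1, {"x1": 1, "x2": -3, "1": 1}),   # 0 <= x1 - 3*x2 + 1
     (3, {"x1": 1, "x2": 1})]            # 0 <= x1 + x2
B = [(2, {"x3": 1, "x1": -2, "1": -3}),  # 0 <= x3 - 2*x1 - 3
     (1, {"x3": -2, "1": 1})]            # 0 <= 1 - 2*x3

root = combine(A + B)   # the contradiction at the root of the proof
itp = combine(A)        # the interpolant: B-atoms contribute 0 <= 0
```

Here `root` is the constant term $-4$ and `itp` is the term $4x_1 + 1$, matching the two proofs above.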

3. FROM SMT($\mathcal{LA}(\mathbb{Q})$) SOLVING TO SMT($\mathcal{LA}(\mathbb{Q})$) INTERPOLATION

Traditionally, SMT solvers used some kind of incremental simplex algorithm [Vanderbei 
2001] as T-solver for the CA{Q) theory. Recently, Dutertre and de Moura [Dutertre and 
de Moura 2006] have proposed a new simplex-based algorithm, specifically designed for 
integration in a lazy SMT solver. The algorithm is extremely suitable for SMT, and SMT 
solvers embedding it were shown to significantly outperform (often by orders of magni- 
tude) the ones based on other simplex variants. It has now been integrated in several SMT 
solvers, including ArgoLib, CVC3, MathSAT, Yices, and Z3. Remarkably, this algo- 
rithm allows for handling also strict inequalities. 

In this Section, we show how to exploit this algorithm to efficiently generate interpolants
for $\mathcal{LA}(\mathbb{Q})$ formulas. Interpolation for the SMT($\mathcal{LA}(\mathbb{Q})$) problem is then obtained by
combining this technique with the general interpolation algorithm for SMT(T) described
in §2.3. In §3.1 we begin by considering the case
in which the input atoms are only equalities and non-strict inequalities. In this case, we
only need to show how to generate a proof of unsatisfiability, since then we can use the
interpolation rules defined in [McMillan 2005]. Then, in §3.2 we show how to generate
interpolants for problems containing also strict inequalities and disequalities.

⁴ In the following, we might sometimes write $\bot$ as a synonym of an atom "$0 \le c$" when $c$ is a negative numerical
constant.

⁵ In [McMillan 2005] the LeqEq rule is not used in $\mathcal{LA}(\mathbb{Q})$, because the input is assumed to consist only of
inequalities.

3.1 Interpolation with non-strict inequalities 

3.1.1 The original Dutertre-de Moura algorithm. In its original formulation, the
Dutertre-de Moura algorithm assumes that the variables $x_i$ are partitioned a priori in two sets,
hereafter denoted as $\hat{\mathcal{B}}$ ("initially basic" or "dependent") and $\hat{\mathcal{N}}$ ("initially non-basic" or
"independent"), and that the algorithm receives as inputs two kinds of atomic formulas:⁶

- a set of equations $eq_i$, one for each $x_i \in \hat{\mathcal{B}}$, of the form $\sum_{x_j \in \hat{\mathcal{N}}} \hat{a}_{ij} x_j + \hat{a}_{ii} x_i = 0$ s.t.
  all $\hat{a}_{ij}$'s are numerical constants;

- elementary atoms of the form $x_j \ge l_j$ or $x_j \le u_j$ s.t. $x_j \in \hat{\mathcal{B}} \cup \hat{\mathcal{N}}$ and $l_j$, $u_j$ are
  numerical constants.

In order to handle problems that are not in the above form, a satisfiability-preserving
preprocessing step is applied upfront, before invoking the algorithm.
The initial equations $eq_i$ are then used to build a tableau T:

    $\{ x_i = \sum_{x_j \in \mathcal{N}} a_{ij} x_j \mid x_i \in \mathcal{B} \} \qquad (3)$

where $\mathcal{B}$ ("basic" or "dependent"), $\mathcal{N}$ ("non-basic" or "independent") and $a_{ij}$ are such that
initially $\mathcal{B} \equiv \hat{\mathcal{B}}$, $\mathcal{N} \equiv \hat{\mathcal{N}}$ and $a_{ij} \equiv -\hat{a}_{ij}/\hat{a}_{ii}$.

In order to decide the satisfiability of the input problem, the algorithm performs
manipulations of the tableau that change the sets $\mathcal{B}$ and $\mathcal{N}$ and the values of the coefficients
$a_{ij}$, always keeping the tableau T in (3) equivalent to its initial version. In particular, the
algorithm maintains a mapping $\beta : \mathcal{B} \cup \mathcal{N} \mapsto \mathbb{Q}$ representing a candidate model which,
at every step, satisfies the following invariants:

    $\forall x_j \in \mathcal{N},\ l_j \le \beta(x_j) \le u_j, \qquad \forall x_i \in \mathcal{B},\ \beta(x_i) = \sum_{x_j \in \mathcal{N}} a_{ij} \beta(x_j). \qquad (4)$

The algorithm tries to adjust the values of $\beta$ and the sets $\mathcal{B}$ and $\mathcal{N}$, and hence the
coefficients $a_{ij}$ of the tableau, such that $l_i \le \beta(x_i) \le u_i$ holds also for all the $x_i$'s in $\mathcal{B}$.
Inconsistency is detected when this is not possible without violating any constraint in (4):
as the bounds on the variables in $\mathcal{N}$ are always satisfied by $\beta$, then there is a variable
$x_i \in \mathcal{B}$ such that the inconsistency is caused either by the elementary atom $x_i \ge l_i$ or
by the atom $x_i \le u_i$ [Dutertre and de Moura 2006]; in the first case,⁷ a conflict set $\eta$ is
generated as follows:

    $\eta = \{x_j \le u_j \mid x_j \in \mathcal{N}^+\} \cup \{x_j \ge l_j \mid x_j \in \mathcal{N}^-\} \cup \{x_i \ge l_i\}, \qquad (5)$

where $(x_i = \sum_{x_j \in \mathcal{N}} a_{ij} x_j)$ is the row of the current version of the tableau T (3)
corresponding to $x_i$, $\mathcal{N}^+$ is $\{x_j \in \mathcal{N} \mid a_{ij} > 0\}$ and $\mathcal{N}^-$ is $\{x_j \in \mathcal{N} \mid a_{ij} < 0\}$.

Notice that $\eta$ is a conflict set in the sense that it is made inconsistent by (some of) the
equations in the tableau T (3), i.e. $T \cup \eta \models_{\mathcal{LA}(\mathbb{Q})} \bot$. In general, however, $\eta \not\models_{\mathcal{LA}(\mathbb{Q})} \bot$.

⁶ Notationally, we use the hat symbol $\hat{\ }$ to denote the initial value of the generic symbol.

⁷ Here we do not consider the second case $x_i \le u_i$ as it is analogous to the first one.

3.1.2 Our proof-producing variant. In order to make it suitable for interpolant generation,
we have conceived the following variant of the Dutertre-de Moura algorithm.

We take as input an arbitrary set of inequalities $l_k \le \sum_h \hat{a}_{kh} y_h$ or $u_k \ge \sum_h \hat{a}_{kh} y_h$,
and apply an internal preprocessing step to obtain a set of equations and a set of elementary
bounds. In particular, we introduce a "slack" variable $s_k$ for each distinct term $\sum_h \hat{a}_{kh} y_h$
occurring in the input inequalities. Then, we replace such term with $s_k$ (thus obtaining
$l_k \le s_k$ or $u_k \ge s_k$) and add an equation $s_k = \sum_h \hat{a}_{kh} y_h$. Notice that we introduce a
slack variable even for "elementary" inequalities $(l_k \le y_h)$. With this transformation, the
initial tableau T (3) is:

    $\{ s_k = \sum_h \hat{a}_{kh} y_h \}_k, \qquad (6)$

s.t. $\hat{\mathcal{B}}$ is made of all the slack variables $s_k$'s, $\hat{\mathcal{N}}$ is made of all the original variables $y_h$'s,
and the elementary atoms contain only slack variables $s_k$'s.
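
The preprocessing step is simple enough to sketch. The code below is an illustration under assumed conventions: an input inequality is a triple `(kind, bound, term)` with `kind` equal to `"lower"` (bound <= term) or `"upper"` (bound >= term), a term is a dictionary from variable names to coefficients, and slack variables are named `s0, s1, ...`. None of these names come from the paper or from MathSAT.

```python
def preprocess(inequalities):
    """Introduce one slack variable per distinct term.

    Returns (tableau, bounds): tableau maps each slack variable to its defining
    term, as in (6); bounds lists the elementary atoms over slack variables.
    """
    slack_of = {}   # canonical term -> slack variable name
    tableau = {}    # slack variable name -> term (dict: variable -> coefficient)
    bounds = []     # (kind, bound, slack variable name)
    for kind, bound, term in inequalities:
        key = tuple(sorted(term.items()))      # canonical form of the term
        if key not in slack_of:
            name = f"s{len(slack_of)}"
            slack_of[key] = name
            tableau[name] = dict(term)
        bounds.append((kind, bound, slack_of[key]))
    return tableau, bounds

# 1 <= y1 + 2*y2,  5 >= y1 + 2*y2,  0 <= y1
tableau, bounds = preprocess([
    ("lower", 1, {"y1": 1, "y2": 2}),
    ("upper", 5, {"y2": 2, "y1": 1}),   # same term, written in a different order
    ("lower", 0, {"y1": 1}),
])
```

The first two inequalities share one slack variable, while the elementary inequality on `y1` still receives its own, so every bound in the result refers to a slack variable only.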

Then the algorithm proceeds as described above, producing a set $\eta$ (5) in case of
inconsistency. In our variant of the algorithm, we can use $\eta$ to generate a conflict set $\eta'$, thanks
to the following theorem.

Theorem 3.1. In the set $\eta$ of (5), $x_i$ and all the $x_j$'s are slack variables introduced
by our preprocessing step. Moreover, the set $\eta' \equiv \eta_{\mathcal{N}^+} \cup \eta_{\mathcal{N}^-} \cup \eta_i$ is a conflict set, where

    $\eta_{\mathcal{N}^+} \equiv \{u_k \ge \sum_h \hat{a}_{kh} y_h \mid s_k \equiv x_j \text{ and } x_j \in \mathcal{N}^+\}$,
    $\eta_{\mathcal{N}^-} \equiv \{l_k \le \sum_h \hat{a}_{kh} y_h \mid s_k \equiv x_j \text{ and } x_j \in \mathcal{N}^-\}$,
    $\eta_i \equiv \{l_k \le \sum_h \hat{a}_{kh} y_h \mid s_k \equiv x_i\}$.

Proof. We consider the case in which $\eta$ (5) is generated from a row $x_i = \sum_{x_j \in \mathcal{N}} a_{ij} x_j$
in the tableau T (3) such that $\beta(x_i) < l_i$. In [Dutertre and de Moura 2006] it is shown that
in this case the following facts hold:

    $\forall x_j \in \mathcal{N}^+,\ \beta(x_j) = u_j, \text{ and } \forall x_j \in \mathcal{N}^-,\ \beta(x_j) = l_j. \qquad (7)$

(We recall that $\mathcal{N}^+ = \{x_j \in \mathcal{N} \mid a_{ij} > 0\}$ and $\mathcal{N}^- = \{x_j \in \mathcal{N} \mid a_{ij} < 0\}$.) The bounds
$u_j$ and $l_j$ can be introduced only by elementary atoms. Since in our variant the elementary
atoms contain only slack variables, each $x_j$ must be a slack variable (namely $s_k$). The
same holds for $x_i$ (since its value is bounded by $l_i$).

Now consider $\eta$ again. In [Dutertre and de Moura 2006] it is shown that when a conflict
is detected because $\beta(x_i) < l_i$, then the following fact holds:

    $\beta(x_i) = \sum_{x_j \in \mathcal{N}^+} a_{ij} u_j + \sum_{x_j \in \mathcal{N}^-} a_{ij} l_j. \qquad (8)$

From the $i$-th row of the tableau T (3) we can derive

    $0 \le \sum_{x_j \in \mathcal{N}} a_{ij} x_j - x_i. \qquad (9)$

If we take each inequality $0 \le u_j - x_j$ multiplied by the coefficient $a_{ij}$ for all $x_j \in \mathcal{N}^+$,
each inequality $0 \le x_j - l_j$ multiplied by coefficient $-a_{ij}$ for all $x_j \in \mathcal{N}^-$, and the
inequality $(0 \le x_i - l_i)$ multiplied by 1, and we add them to (9), we obtain

    $0 \le \sum_{x_j \in \mathcal{N}^+} a_{ij} u_j + \sum_{x_j \in \mathcal{N}^-} a_{ij} l_j - l_i, \qquad (10)$

which by (8) is equivalent to $0 \le \beta(x_i) - l_i$. Thus we have obtained $0 \le c$ with $c \equiv \beta(x_i) - l_i$,
which is strictly lower than zero. Therefore, $\eta$ is inconsistent under the definitions in
T. Since we know that $x_i$ and all the $x_j$'s in $\eta$ are slack variables, we can replace every
$x_j$ (i.e., every $s_k$) with its corresponding term $\sum_h \hat{a}_{kh} y_h$, thus obtaining $\eta'$, which is thus
inconsistent. □

When our variant of the algorithm detects an inconsistency, we construct a proof of
unsatisfiability as follows. From the set $\eta$ of (5) we build a conflict set $\eta'$ by replacing each
elementary atom in it with the corresponding original atom, as shown in Theorem 3.1.
Using the HYP rule, we introduce all the atoms in $\eta'_{N^+}$, and combine them with repeated
applications of the COMB rule; if $u_k \ge \sum_h a_{kh} y_h$ is the atom corresponding to $s_k$, we
use as coefficient for the COMB the $a_{ij}$ (in the $i$-th row of the current tableau) such that
$s_k = x_j$. Then, we introduce each of the atoms in $\eta'_{N^-}$ with HYP, and add them to the
previous combination, again using COMB. In this case, the coefficient to use is $-a_{ij}$.
Finally, we introduce the atom in $\eta'_i$ and add it to the combination with coefficient 1.

Corollary 3.2. The result of the linear combination described above is the atom
$0 \le c$, such that $c$ is a numerical constant strictly lower than zero.

Proof. Follows immediately by the proof of Theorem 3.1. □ 

Besides the case just described (and its dual when the inconsistency is due to an elementary
atom $x_i \le u_i$), another case in which an inconsistency can be detected is when two
contradictory atoms are asserted: $l_k \le \sum_h a_{kh} y_h$ and $u_k \ge \sum_h a_{kh} y_h$, with $l_k > u_k$. In
this case, the proof is simply the combination of the two atoms with coefficient 1.

The extension for handling also equalities like $b_k = \sum_h a_{kh} y_h$ is straightforward: we
simply introduce two elementary atoms $b_k \le s_k$ and $b_k \ge s_k$ and, in the construction of
the proof, we use the LEQEQ rule to introduce the proper inequality.

Finally, notice that the current implementation in MathSAT (see §7) is slightly different
from what is presented here, and significantly more efficient. In practice, $\eta$ and $\eta'$ are not
constructed in sequence; rather, they are built simultaneously. Moreover, some optimizations
are applied to eliminate some slack variables when they are not needed.

Example 3.1. Consider again the two sets of $\mathcal{LA}(\mathbb{Q})$ atoms of Example 2.2:

$A = \{(0 \le x_1 - 3x_2 + 1),\ (0 \le x_1 + x_2)\}$

$B = \{(0 \le x_3 - 2x_1 - 3),\ (0 \le 1 - 2x_3)\}.$

With our variant of the Dutertre-de Moura algorithm, four "slack" variables are
introduced, resulting in the following tableau and elementary constraints:

\[
\begin{array}{rcl@{\qquad}rcl}
s_1 &=& x_1 - 3x_2 & -1 &\le& s_1 \\
s_2 &=& x_1 + x_2 & 0 &\le& s_2 \\
s_3 &=& x_3 - 2x_1 & 3 &\le& s_3 \\
s_4 &=& -2x_3 & -1 &\le& s_4
\end{array}
\]

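To detect the inconsistency, the algorithm performs some pivoting steps, resulting in the
final tableau $T'$:

\[
\begin{array}{rcl}
x_2 &=& -\tfrac{1}{12} s_4 - \tfrac{1}{6} s_3 - \tfrac{1}{3} s_1 \\
s_2 &=& -\tfrac{1}{3} s_4 - \tfrac{2}{3} s_3 - \tfrac{1}{3} s_1 \\
x_1 &=& -\tfrac{1}{4} s_4 - \tfrac{1}{2} s_3 \\
x_3 &=& -\tfrac{1}{2} s_4
\end{array}
\]

The final values of $\beta$ are as follows:

\[
\beta(x_1) = -\tfrac{5}{4} \qquad \beta(x_2) = -\tfrac{1}{12} \qquad \beta(x_3) = \tfrac{1}{2}
\]
\[
\beta(s_1) = -1 \qquad \beta(s_2) = -\tfrac{4}{3} \qquad \beta(s_3) = 3 \qquad \beta(s_4) = -1
\]

Therefore, the bound $(0 \le s_2)$ is violated. From the second row of $T'$, the set $\eta$ and the
conflict set $\eta'$ are computed:

$\eta = \emptyset \cup \{(-1 \le s_4),\ (3 \le s_3),\ (-1 \le s_1)\} \cup \{(0 \le s_2)\}$

$\eta' = \emptyset \cup \{(0 \le 1 - 2x_3),\ (0 \le x_3 - 2x_1 - 3),\ (0 \le x_1 - 3x_2 + 1)\} \cup \{(0 \le x_1 + x_2)\}$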
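The generated proof of unsatisfiability $P$ is:

\[
\dfrac{\dfrac{\dfrac{\tfrac{1}{3} * (0 \le 1 - 2x_3) \quad \tfrac{2}{3} * (0 \le x_3 - 2x_1 - 3)}
{1 * (0 \le -\tfrac{4}{3} x_1 - \tfrac{5}{3})} \quad \tfrac{1}{3} * (0 \le x_1 - 3x_2 + 1)}
{1 * (0 \le -x_1 - x_2 - \tfrac{4}{3})} \quad 1 * (0 \le x_1 + x_2)}
{(0 \le -\tfrac{4}{3})}
\]

After replacing the inequalities of $B$ with $(0 \le 0)$ in $P$, the new proof $P'$ is:

\[
\dfrac{\dfrac{\dfrac{\tfrac{1}{3} * (0 \le 0) \quad \tfrac{2}{3} * (0 \le 0)}
{1 * (0 \le 0)} \quad \tfrac{1}{3} * (0 \le x_1 - 3x_2 + 1)}
{1 * (0 \le \tfrac{1}{3} x_1 - x_2 + \tfrac{1}{3})} \quad 1 * (0 \le x_1 + x_2)}
{(0 \le \tfrac{4}{3} x_1 + \tfrac{1}{3})}
\]

Thus the computed interpolant is $(0 \le \tfrac{4}{3} x_1 + \tfrac{1}{3})$ (which is equivalent to that of
Example 2.2).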
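
The arithmetic of Example 3.1 can be replayed mechanically. The following is a minimal sketch, not the MathSAT implementation: the dictionary encoding of atoms and the helper names `comb` and `replay` are assumptions made only for illustration. It applies the COMB steps with the coefficients 1/3, 2/3, 1/3 and 1 read off the second row of the final tableau, first on the original atoms and then with the atoms of B replaced by (0 <= 0).

```python
from fractions import Fraction as F

CONST = 1  # hypothetical key under which an atom stores its constant term

def comb(c1, p, c2, q):
    """COMB rule: return the atom c1*p + c2*q, for positive c1 and c2.

    An atom (0 <= k + sum of a_v * v) is a dict {variable: a_v, CONST: k}.
    """
    out = {}
    for c, atom in ((c1, p), (c2, q)):
        for key, coeff in atom.items():
            out[key] = out.get(key, 0) + c * coeff
    # Drop terms that cancelled out.
    return {k: v for k, v in out.items() if v != 0}

def replay(a1, a2, b1, b2):
    """Replay the three COMB steps of the proof P of Example 3.1."""
    step = comb(F(1, 3), b2, F(2, 3), b1)
    step = comb(F(1), step, F(1, 3), a1)
    return comb(F(1), step, F(1), a2)

a1 = {"x1": F(1), "x2": F(-3), CONST: F(1)}   # 0 <= x1 - 3*x2 + 1
a2 = {"x1": F(1), "x2": F(1)}                 # 0 <= x1 + x2
b1 = {"x3": F(1), "x1": F(-2), CONST: F(-3)}  # 0 <= x3 - 2*x1 - 3
b2 = {"x3": F(-2), CONST: F(1)}               # 0 <= 1 - 2*x3

root = replay(a1, a2, b1, b2)         # root of P
interpolant = replay(a1, a2, {}, {})  # root of P', with B replaced by 0 <= 0
print(root, interpolant)
```

Under this encoding the root of P is the constant atom with value -4/3 and the root of P' is the atom with coefficient 4/3 on x1 and constant 1/3, matching the example.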

3.2 Interpolation with strict inequalities and disequalities 

Another benefit of the Dutertre-de Moura algorithm is that it can handle strict inequalities 
directly. Its method is based on the following lemma. 

Lemma 3.3 (Lemma 1 in [Dutertre and de Moura 2006]). A set of linear arithmetic
atoms $\Gamma$ containing strict inequalities $S = \{0 < t_1, \ldots, 0 < t_n\}$ is satisfiable iff
there exists a rational number $\varepsilon > 0$ such that $\Gamma_\varepsilon = (\Gamma \cup S_\varepsilon) \setminus S$ is satisfiable, where
$S_\varepsilon = \{\varepsilon \le t_1, \ldots, \varepsilon \le t_n\}$.

The idea of [Dutertre and de Moura 2006] is that of treating the infinitesimal parameter $\varepsilon$
symbolically instead of explicitly computing its value. Strict bounds $(x < b)$ are replaced
with weak ones $(x \le b - \varepsilon)$, and the operations on bounds are adjusted to take $\varepsilon$ into
account.

We extend the same idea to the computation of interpolants. We transform every atom
$(0 < t_i)$ occurring in the proof of unsatisfiability into $(0 \le t_i - \varepsilon)$. Then we compute an
interpolant $I_\varepsilon$ in the usual way. As a consequence of the rules of [McMillan 2005], $I_\varepsilon$ is
always a single atom. As shown by the following theorem, if $I_\varepsilon$ contains $\varepsilon$, then it must be
in the form $(0 \le t - c\,\varepsilon)$ with $c > 0$, and we can rewrite $I_\varepsilon$ into $(0 < t)$.

Theorem 3.4 (Interpolation with strict inequalities). Let $\Gamma$, $S$, $\Gamma_\varepsilon$ and $S_\varepsilon$
be defined as in Lemma 3.3. Let $\Gamma$ be partitioned into $A$ and $B$, and let $A_\varepsilon$ and $B_\varepsilon$ be
obtained from $A$ and $B$ by replacing atoms in $S$ with the corresponding ones in $S_\varepsilon$. Let $I_\varepsilon$
be an interpolant for $(A_\varepsilon, B_\varepsilon)$. Then:

If $\varepsilon \not\preceq I_\varepsilon$, then $I_\varepsilon$ is an interpolant for $(A, B)$.

If $\varepsilon \preceq I_\varepsilon$, then $I_\varepsilon = (0 \le t - c\,\varepsilon)$ for some $c > 0$, and $I = (0 < t)$ is an interpolant
for $(A, B)$.

Proof. Since the side condition of the COMB rule ensures that equations are combined
only using positive coefficients, and since the atoms introduced in the proof either do not
contain $\varepsilon$ or contain it with a negative coefficient, if $\varepsilon$ appears in $I_\varepsilon$, it must have a negative
coefficient.

If $\varepsilon$ does not appear in $I_\varepsilon$, then $I_\varepsilon$ has been obtained from atoms appearing in $A$ or $B$,
so that $I_\varepsilon$ is an interpolant for $(A, B)$.

If $\varepsilon$ appears in $I_\varepsilon$, since its value has not been explicitly computed, it can be arbitrarily
small, so thanks to Lemma 3.3 we have that $B_\varepsilon \wedge I_\varepsilon \models_{\mathcal{LA}(\mathbb{Q})} \bot$ implies $B \wedge I \models_{\mathcal{LA}(\mathbb{Q})} \bot$.

We can prove that $A \models_{\mathcal{LA}(\mathbb{Q})} I$ as follows. We consider some interpretation $\mu$ which is
a model for $A$. Since $\varepsilon$ does not occur in $A$, we can extend $\mu$ by setting $\mu(\varepsilon) = \delta$ for some
$\delta > 0$ such that $\mu$ is a model also for $A_\varepsilon$. As $A_\varepsilon \models_{\mathcal{LA}(\mathbb{Q})} I_\varepsilon$, $\mu$ is also a model for $I_\varepsilon$, and
hence $\mu$ is also a model for $I$. Thus, we have that $A \models_{\mathcal{LA}(\mathbb{Q})} I$. □

Notice that Theorem 3.4 can be extended straightforwardly to the case in which the 
interpolant is a conjunction of inequalities. 
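
The rewriting step of Theorem 3.4 amounts to dropping the symbolic ε term and turning the weak inequality into a strict one. The sketch below is only an illustration under assumed conventions: atoms are dictionaries mapping variable names to coefficients, the key `"eps"` stands for ε, the key `1` holds the constant, and `drop_epsilon` is a hypothetical helper name.

```python
def drop_epsilon(atom):
    """Turn an interpolant atom over epsilon into one without it.

    `atom` encodes 0 <= sum(coeff * var), with the key "eps" for epsilon and
    the key 1 for the constant term. Returns (relation, atom_without_eps),
    where relation is "<" if epsilon occurred and "<=" otherwise.
    """
    c = atom.get("eps", 0)
    if c > 0:
        # By the proof of Theorem 3.4 epsilon can only occur negatively.
        raise ValueError("epsilon must have a non-positive coefficient")
    rest = {k: v for k, v in atom.items() if k != "eps"}
    return ("<" if c < 0 else "<=", rest)

# (0 <= 4*x1 + 1 - eps) becomes (0 < 4*x1 + 1)
strict = drop_epsilon({"x1": 4, 1: 1, "eps": -1})
# An atom without epsilon is returned unchanged as a weak inequality.
weak = drop_epsilon({"x1": 4, 1: 1})
print(strict, weak)
```

The positive coefficient c of ε is irrelevant for the result, which is why the function only inspects its sign.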

Thus, in case of strict inequalities, Theorem 3.4 gives us a way for constructing
interpolants with no need of expensive theory combination (as instead was the case in
[McMillan 2005]). Moreover, thanks to it we can handle also negated equalities $(0 \ne t)$ directly.
Suppose our set $S$ of input atoms (partitioned into $A$ and $B$) is the union of a set $S'$ of
equalities and inequalities (both weak and strict) and a set $S^{\ne}$ of disequalities, and
suppose that $S'$ is consistent. (If not so, an interpolant can be computed from $S'$.) Since
$\mathcal{LA}(\mathbb{Q})$ is convex, $S$ is inconsistent iff there exists $(0 \ne t) \in S^{\ne}$ such that $S' \cup \{(0 \ne t)\}$ is
inconsistent, that is, such that both $S' \cup \{(0 < t)\}$ and $S' \cup \{(0 > t)\}$ are inconsistent.

Therefore, we pick one element $(0 \ne t)$ of $S^{\ne}$ at a time, and check the satisfiability of
$S' \cup \{(0 < t)\}$ and $S' \cup \{(0 > t)\}$. If both are inconsistent, from the two proofs we can
generate two interpolants $I^-$ and $I^+$. We combine $I^+$ and $I^-$ to obtain an interpolant $I$
for $(A, B)$: if $(0 \ne t) \in A$, then $I$ is $I^+ \vee I^-$; if $(0 \ne t) \in B$, then $I$ is $I^+ \wedge I^-$, as shown
by the following theorem.

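Theorem 3.5 (Interpolation for negated equalities). Let $A$ and $B$ be two
conjunctions of $\mathcal{LA}(\mathbb{Q})$ atoms, and let $n = (0 \ne t)$ be one such atom. Let $g = (0 < t)$ and
$l = (0 > t)$.

If $n \in A$, then let $A^+ = A \setminus \{n\} \cup \{g\}$, $A^- = A \setminus \{n\} \cup \{l\}$, and $B^+ = B^- = B$.
If $n \in B$, then let $A^+ = A^- = A$, $B^+ = B \setminus \{n\} \cup \{g\}$, and $B^- = B \setminus \{n\} \cup \{l\}$.
Assume that $A^+ \wedge B^+ \models_{\mathcal{LA}(\mathbb{Q})} \bot$ and that $A^- \wedge B^- \models_{\mathcal{LA}(\mathbb{Q})} \bot$, and let $I^+$ and $I^-$ be
two interpolants for $(A^+, B^+)$ and $(A^-, B^-)$ respectively, and let

\[
I = \begin{cases} I^+ \vee I^- & \text{if } n \in A \\ I^+ \wedge I^- & \text{if } n \in B. \end{cases}
\]

Then $I$ is an interpolant for $(A, B)$.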
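Proof. We have to prove that:

(i) $A \models_{\mathcal{LA}(\mathbb{Q})} I$
(ii) $B \wedge I \models_{\mathcal{LA}(\mathbb{Q})} \bot$
(iii) $I \preceq A$ and $I \preceq B$.

(i) If $n \in A$, then $A \models_{\mathcal{LA}(\mathbb{Q})} g \vee l$. By hypothesis, we know that $A^+ \models_{\mathcal{LA}(\mathbb{Q})} I^+$
and $A^- \models_{\mathcal{LA}(\mathbb{Q})} I^-$. Then trivially $A \cup \{g\} \models_{\mathcal{LA}(\mathbb{Q})} I^+$ and $A \cup \{l\} \models_{\mathcal{LA}(\mathbb{Q})} I^-$.
Therefore $A \cup \{g\} \models_{\mathcal{LA}(\mathbb{Q})} I^+ \vee I^-$ and $A \cup \{l\} \models_{\mathcal{LA}(\mathbb{Q})} I^- \vee I^+$, so that
$A \models_{\mathcal{LA}(\mathbb{Q})} I$.

If $n \in B$, then $A^+ = A^- = A$. By hypothesis $A \models_{\mathcal{LA}(\mathbb{Q})} I^+$ and $A \models_{\mathcal{LA}(\mathbb{Q})} I^-$, so
that $A \models_{\mathcal{LA}(\mathbb{Q})} I$.

(ii) If $n \in A$, then $B^+ = B^- = B$. By hypothesis $B \wedge I^+ \models_{\mathcal{LA}(\mathbb{Q})} \bot$ and
$B \wedge I^- \models_{\mathcal{LA}(\mathbb{Q})} \bot$, so that $B \wedge I \models_{\mathcal{LA}(\mathbb{Q})} \bot$.

If $n \in B$, then $B \models_{\mathcal{LA}(\mathbb{Q})} g \vee l$, so that either $B \to g$ or $B \to l$ must hold. By
hypothesis we have $B^+ \wedge I^+ \models_{\mathcal{LA}(\mathbb{Q})} \bot$, so that $B \cup \{g\} \wedge I^+ \models_{\mathcal{LA}(\mathbb{Q})} \bot$. If $B \to g$
holds, then $B \wedge I^+ \models_{\mathcal{LA}(\mathbb{Q})} \bot$, and hence $B \wedge I \models_{\mathcal{LA}(\mathbb{Q})} \bot$. Similarly, if $B \to l$
holds, then $B \wedge I^- \models_{\mathcal{LA}(\mathbb{Q})} \bot$, and so again $B \wedge I \models_{\mathcal{LA}(\mathbb{Q})} \bot$.

(iii) By the hypothesis, both $I^+$ and $I^-$ contain only symbols common to $A$ and $B$, so
that $I \preceq A$ and $I \preceq B$. □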

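Example 3.2. Consider the following sets of $\mathcal{LA}(\mathbb{Q})$ atoms:

$A = \{(0 \ne x_1 - 3x_2 + 1),\ (0 = x_1 + x_2)\}$

$B = \{(0 = x_3 - 2x_1 - 1),\ (0 = 1 - 2x_3)\}.$

To compute an interpolant for $(A, B)$, we first split $n = (0 \ne x_1 - 3x_2 + 1)$ into
$g = (0 < x_1 - 3x_2 + 1)$ and $l = (0 < -x_1 + 3x_2 - 1)$, thus obtaining $A^+$ and $A^-$ defined as in
Theorem 3.5. We then generate two $\mathcal{LA}(\mathbb{Q})$-proofs of unsatisfiability, $P^+$ for $A^+ \wedge B$ and
$P^-$ for $A^- \wedge B$, and replace $g$ in $P^+$ with $g_\varepsilon = (0 \le x_1 - 3x_2 + 1 - \varepsilon)$ and $l$ in $P^-$ with
$l_\varepsilon = (0 \le -x_1 + 3x_2 - 1 - \varepsilon)$, obtaining $P^+_\varepsilon$ and $P^-_\varepsilon$ (we omit the names of the inference
rules):

\[
P^+_\varepsilon = \dfrac{\dfrac{(0 \le x_1 - 3x_2 + 1 - \varepsilon) \quad \dfrac{(0 = x_1 + x_2)}{(0 \le x_1 + x_2)}}{(0 \le 4x_1 + 1 - \varepsilon)}
\quad
\dfrac{\dfrac{(0 = x_3 - 2x_1 - 1)}{(0 \le x_3 - 2x_1 - 1)} \quad \dfrac{(0 = 1 - 2x_3)}{(0 \le 1 - 2x_3)}}{(0 \le -4x_1 - 1)}}
{(0 \le -\varepsilon)}
\]

\[
P^-_\varepsilon = \dfrac{\dfrac{(0 \le -x_1 + 3x_2 - 1 - \varepsilon) \quad \dfrac{(0 = x_1 + x_2)}{(0 \le -x_1 - x_2)}}{(0 \le -4x_1 - 1 - \varepsilon)}
\quad
\dfrac{\dfrac{(0 = x_3 - 2x_1 - 1)}{(0 \le -x_3 + 2x_1 + 1)} \quad \dfrac{(0 = 1 - 2x_3)}{(0 \le -1 + 2x_3)}}{(0 \le 4x_1 + 1)}}
{(0 \le -\varepsilon)}
\]

We then compute the two interpolants $I^+_\varepsilon$ from $P^+_\varepsilon$ and $I^-_\varepsilon$ from $P^-_\varepsilon$:

$I^+_\varepsilon = (0 \le 4x_1 + 1 - \varepsilon) \qquad I^-_\varepsilon = (0 \le -4x_1 - 1 - \varepsilon).$

Therefore, according to Theorem 3.4 the two interpolants $I^+$ for $(A^+, B)$ and $I^-$ for
$(A^-, B)$ are:

$I^+ = (0 < 4x_1 + 1) \qquad I^- = (0 < -4x_1 - 1).$

Finally, since $n \in A$, according to Theorem 3.5, the interpolant $I$ for $(A, B)$ is

$I = I^+ \vee I^- = (0 < 4x_1 + 1) \vee (0 < -4x_1 - 1).$

3.3 Obtaining stronger interpolants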

We conclude this Section by illustrating a simple technique for improving the strength of
interpolants in $\mathcal{LA}(\mathbb{Q})$. The technique is orthogonal to our proof-generation algorithm
described in §3.1.2, and it is therefore of independent interest. It is an improvement of the
general algorithm of [McMillan 2005] (and outlined in §2.3.1) for generating interpolants
from $\mathcal{LA}(\mathbb{Q})$-proofs of unsatisfiability.

Definition 3.6. Given two interpolants $I_1$ and $I_2$ for the same pair $(A, B)$ of
conjunctions of $\mathcal{LA}(\mathbb{Q})$-literals, we say that $I_1$ is stronger than $I_2$ if and only if
$I_1 \models_{\mathcal{LA}(\mathbb{Q})} I_2$ but $I_2 \not\models_{\mathcal{LA}(\mathbb{Q})} I_1$.

Our technique is based on the simple observation that the only purpose of the
summations performed during the traversal of proof trees for computing the interpolant (as
described in §2.3.1) is that of eliminating A-local variables. In fact, it is easy to see that
the conjunction of the constraints of $A$ occurring as leaves in an $\mathcal{LA}(\mathbb{Q})$-proof of
unsatisfiability satisfies the first two points of the definition of interpolant (Definition 2.2): if
such constraints do not contain A-local variables, therefore, their conjunction is already
an interpolant; if not, it suffices to perform only the summations of constraints of $A$ that are
necessary to eliminate A-local variables. Moreover, such an interpolant is stronger than or
equal to that obtained by performing the summations with the coefficients found in the proof
tree, since for any set of constraints $\{s_1, \ldots, s_n\}$ and any set of positive coefficients
$\{c_1, \ldots, c_n\}$, $s_1 \wedge \ldots \wedge s_n \models_{\mathcal{LA}(\mathbb{Q})} \sum_{i=1}^{n} c_i * s_i$ holds.

According to this observation, our proposal can be described as: perform only those
summations which are necessary for eliminating A-local variables.

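Example 3.3. Consider the following sets of $\mathcal{LA}(\mathbb{Q})$-atoms:

$A = \{(0 \le x_1 - 3x_2 + 1),\ (0 \le x_2 - \tfrac{1}{3} x_3),\ (0 \le x_4 - \tfrac{3}{2} x_5 - 1)\}$

$B = \{(0 \le 3x_5 - x_1),\ (0 \le x_3 - 2x_4)\}$

and the following $\mathcal{LA}(\mathbb{Q})$-proof of unsatisfiability of $A \wedge B$:

\[
\dfrac{\dfrac{\dfrac{\dfrac{(0 \le x_1 - 3x_2 + 1) \quad 3 * (0 \le x_2 - \tfrac{1}{3} x_3)}
{(0 \le x_1 - x_3 + 1)} \quad 2 * (0 \le x_4 - \tfrac{3}{2} x_5 - 1)}
{(0 \le x_1 - x_3 + 2x_4 - 3x_5 - 1)} \quad (0 \le 3x_5 - x_1)}
{(0 \le -x_3 + 2x_4 - 1)} \quad (0 \le x_3 - 2x_4)}
{(0 \le -1)}
\]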

Here, the variable $x_2$ is A-local, whereas all the others are AB-common. The interpolant
computed with the algorithm of §2.3.1 is

$(0 \le x_1 - x_3 + 2x_4 - 3x_5 - 1),$

which is the result of the linear combination of all the atoms of $A$ in the proof. However,
in order to eliminate the A-local variable $x_2$, it is enough to combine $(0 \le x_1 - 3x_2 + 1)$
(with coefficient 1) and $(0 \le x_2 - \tfrac{1}{3} x_3)$ (with coefficient 3), obtaining $(0 \le x_1 - x_3 + 1)$.
Therefore, a stronger interpolant is

$(0 \le x_1 - x_3 + 1) \wedge (0 \le x_4 - \tfrac{3}{2} x_5 - 1).$

The technique can be implemented with a small modification of the proof-based
algorithm described in §2.3.1. We associate with each node in the proof $P'$ (which is obtained
from the original proof $P$ by replacing inequalities from $B$ with $(0 \le 0)$) a list of pairs
(coefficient, inequality). For a leaf, this list is a singleton in which the coefficient is 1 and
the inequality is the atom in the leaf itself. For an inner node (which corresponds to an
application of the COMB rule), the list $l$ is generated from the two lists $l_1$ and $l_2$ of the
premises as follows:

(1) Set $l$ as the concatenation of $l_1$ and $l_2$;

(2) Let $c_1$ and $c_2$ be the coefficients used in the COMB rule. Multiply each coefficient $c'_i$
occurring in a pair $(c'_i, 0 \le t_i)$ of $l$ by $c_1$ if the pair comes from $l_1$, and by $c_2$ otherwise;

(3) While there is an A-local variable $x$ occurring in more than one pair $(c'_i, 0 \le t_i)$ of $l$:

(a) Collect all the pairs $(c'_i, 0 \le t_i)$ in which $x$ occurs;

(b) Generate a new pair $p = (1, 0 \le \sum_i c'_i * t_i)$;

(c) Add $p$ to $l$, and remove all the pairs $(c'_i, 0 \le t_i)$.

After having applied the above algorithm, we can take the conjunction of the inequalities
in the list associated with the root of $P'$ as an interpolant.
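
The list-based procedure above can be sketched in a few lines. This is an illustrative reading of steps (1)-(3), not the authors' implementation: the tuple encoding of proof nodes, the dictionary encoding of atoms (key `1` for the constant) and the function name `pairs` are all assumptions. Leaves from B stand for (0 <= 0) and are simply dropped, since they are trivially true.

```python
from fractions import Fraction as F

def pairs(node, a_local):
    """Return the list of (coefficient, atom) pairs attached to a proof node.

    A node is ("leaf", atom, in_A) or ("comb", c1, left, c2, right).
    An atom 0 <= k + sum(a_v * v) is a dict {v: a_v, 1: k}.
    """
    if node[0] == "leaf":
        _, atom, in_a = node
        return [(F(1), atom)] if in_a else []  # B leaves are (0 <= 0)
    _, c1, left, c2, right = node
    # Steps (1) and (2): concatenate and rescale the coefficients.
    lst = [(c1 * c, t) for c, t in pairs(left, a_local)]
    lst += [(c2 * c, t) for c, t in pairs(right, a_local)]
    # Step (3): sum only the pairs that share an A-local variable.
    while True:
        clash = next((x for x in sorted(a_local)
                      if sum(1 for _, t in lst if x in t) > 1), None)
        if clash is None:
            return lst
        hit = [(c, t) for c, t in lst if clash in t]
        rest = [(c, t) for c, t in lst if clash not in t]
        merged = {}
        for c, t in hit:
            for key, coeff in t.items():
                merged[key] = merged.get(key, 0) + c * coeff
        merged = {k: v for k, v in merged.items() if v != 0}
        lst = rest + [(F(1), merged)]

# Proof of Example 3.3; x2 is the only A-local variable.
a1 = {"x1": F(1), "x2": F(-3), 1: F(1)}
a2 = {"x2": F(1), "x3": F(-1, 3)}
a3 = {"x4": F(1), "x5": F(-3, 2), 1: F(-1)}
b1 = {"x5": F(3), "x1": F(-1)}
b2 = {"x3": F(1), "x4": F(-2)}
proof = ("comb", F(1),
         ("comb", F(1),
          ("comb", F(1),
           ("comb", F(1), ("leaf", a1, True), F(3), ("leaf", a2, True)),
           F(2), ("leaf", a3, True)),
          F(1), ("leaf", b1, False)),
         F(1), ("leaf", b2, False))
result = pairs(proof, {"x2"})
print(result)
```

On this proof the root list contains two pairs: coefficient 1 with the atom 0 <= x1 - x3 + 1, and coefficient 2 with the atom 0 <= x4 - (3/2) x5 - 1. Their conjunction is the stronger interpolant of Example 3.3.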

Theorem 3.7. Let $P$ be an $\mathcal{LA}(\mathbb{Q})$-proof of unsatisfiability for a conjunction $A \wedge B$ of
inequalities, and $P'$ be obtained from $P$ by replacing each inequality of $B$ with $(0 \le 0)$.
Let $l = (c_1, 0 \le t_1), \ldots, (c_n, 0 \le t_n)$ be the list associated with the root of $P'$, computed
as described above. Then $I = \bigwedge_{i=1}^{n} (0 \le t_i)$ is an interpolant for $(A, B)$. Moreover, $I$ is
always stronger than or equal to the interpolant obtained with the algorithm of §2.3.1 for
the same proof $P'$.

Proof. By induction on the structure of $P'$, it is easy to prove that, for each constraint
$(0 \le t)$ in $P'$ with its associated list $l = (c_1, 0 \le t_1), \ldots, (c_n, 0 \le t_n)$:

(1) $A \models \bigwedge_{i=1}^{n} (0 \le t_i)$; and

(2) $(0 \le t) \equiv \sum_{i=1}^{n} c_i \cdot (0 \le t_i)$.

Since the root of $P'$ is an interpolant for $(A, B)$, this immediately proves the theorem. □

4. FROM SMT($\mathcal{DL}$) SOLVING TO SMT($\mathcal{DL}$) INTERPOLATION

Several interesting verification problems can be encoded using only a subset of $\mathcal{LA}$, the
theory of Difference Logic ($\mathcal{DL}$), either over the rationals ($\mathcal{DL}(\mathbb{Q})$) or over the integers
($\mathcal{DL}(\mathbb{Z})$). $\mathcal{DL}$ is much simpler than $\mathcal{LA}$, since in $\mathcal{DL}$ all atoms are inequalities of the form
$(0 \le y - x + c)$, where $x$ and $y$ are variables and $c$ is an integer constant. Equalities
can be handled as conjunctions of inequalities. Here we do not consider the case when we
also have strict inequalities $(0 < y - x + c)$ and disequalities $(0 \ne y - x + c)$, because in
$\mathcal{DL}(\mathbb{Q})$ they can be handled in a way which is similar to that described in §3.2 for $\mathcal{LA}(\mathbb{Q})$,
whilst in $\mathcal{DL}(\mathbb{Z})$ a strict inequality $(0 < y - x + c)$ can be rewritten a priori into a weak one
$(0 \le y - x + c - 1)$, and a disequality can be replaced by a disjunction of strict inequalities.



Footnote to step (3) of the algorithm of §3.3: that is, $x$ occurs in $t$.

Footnote on the constant $c$: notice that we can assume w.l.o.g. that all constants are in $\mathbb{Z}$,
because, if this is not so, then we can rewrite the whole formula into an equisatisfiable one by
multiplying all constant symbols occurring in the formula by the least common multiple of
their denominators.

Very efficient solving algorithms have been conceived for $\mathcal{DL}$ [Cotton and Maler 2006;
Nieuwenhuis and Oliveras 2005]. In this section we present a specialized technique for
computing interpolants in $\mathcal{DL}$ which exploits such state-of-the-art decision procedures.
Since a set of weak inequalities in $\mathcal{DL}$ is consistent over the rationals if and only if it is
consistent over the integers, our algorithm is applicable without any modifications to both
$\mathcal{DL}(\mathbb{Q})$ and $\mathcal{DL}(\mathbb{Z})$ (see e.g. [Nieuwenhuis and Oliveras 2005]).

Many SMT solvers use dedicated, graph-based algorithms for checking the consistency
of a set of $\mathcal{DL}(\mathbb{Q})$ atoms [Cotton and Maler 2006; Nieuwenhuis and Oliveras 2005].
Intuitively, a set $S$ of $\mathcal{DL}(\mathbb{Q})$ atoms induces a graph whose vertexes are the variables of the
atoms, and there exists an edge $x \xrightarrow{c} y$ for every $(0 \le y - x + c) \in S$. $S$ is inconsistent if
and only if the induced graph has a cycle of negative weight.

We now extend the graph-based approach to generate interpolants. Consider the
interpolation problem $(A, B)$ where $A$ and $B$ are sets of inequalities as above, and let $C$ be (the
set of atoms in) a negative cycle in the graph corresponding to $A \cup B$.

If $C \subseteq A$, then $A$ is inconsistent, in which case the interpolant is $\bot$. Similarly, when
$C \subseteq B$, the interpolant is $\top$. If neither of these occurs, then the edges in the cycle can be
partitioned in subsets of $A$ and $B$. We call maximal A-path of $C$ a path
$x_1 \xrightarrow{c_1} \cdots \xrightarrow{c_{n-1}} x_n$ such that (i) $x_i \xrightarrow{c_i} x_{i+1} \in A$ for $i \in [1, n-1]$, and (ii) $C$ contains
$x' \xrightarrow{c'} x_1$ and $x_n \xrightarrow{c''} x''$ that are in $B$. Clearly, the end-point variables $x_1, x_n$ of the
maximal A-path are such that $x_1, x_n \preceq A$ and $x_1, x_n \preceq B$. Let the summary constraint of a
maximal A-path $x_1 \xrightarrow{c_1} \cdots \xrightarrow{c_{n-1}} x_n$ be the inequality $0 \le x_n - x_1 + \sum_{i=1}^{n-1} c_i$.

Theorem 4.1. The conjunction of summary constraints of the maximal A-paths of $C$ is an
interpolant for $(A, B)$.

Proof. Using the rules for $\mathcal{LA}(\mathbb{Q})$ of Figure 3, we build a deduction of the summary
constraint of a maximal A-path from the conjunction of its corresponding set of
constraints $\bigwedge_{i=1}^{n-1} (0 \le x_{i+1} - x_i + c_i)$:

\[
\dfrac{\dfrac{\dfrac{(0 \le x_2 - x_1 + c_1) \quad (0 \le x_3 - x_2 + c_2)}
{(0 \le x_3 - x_1 + c_1 + c_2)} \quad (0 \le x_4 - x_3 + c_3)}
{\cdots} \quad (0 \le x_n - x_{n-1} + c_{n-1})}
{(0 \le x_n - x_1 + \sum_{i=1}^{n-1} c_i)}
\]

Hence, $A$ entails the conjunction of the summary constraints of all maximal A-paths.
Then, we notice that the conjunction of the summary constraints is inconsistent with $B$.
In fact, the weight of a maximal A-path and the weight of its summary constraint are
the same. Thus the cycle obtained from $C$ by replacing each maximal A-path with the
corresponding summary constraint is also a negative cycle. Finally, we notice that every
variable $x$ occurring in the conjunction of the summary constraints is an end-point variable,
and thus $x \preceq A$ and $x \preceq B$. □
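
The construction behind Theorem 4.1 is easy to prototype once a negative cycle is available. The sketch below assumes the cycle has already been found by the DL solver and is given as a list of edges in cyclic order, each tagged with the partition it comes from; the tuple layout and the name `summary_constraints` are illustrative assumptions, not part of the original algorithm description.

```python
def summary_constraints(cycle):
    """Compute the interpolant of Theorem 4.1 from a negative cycle.

    `cycle` lists edges (src, dst, weight, owner) in cyclic order, where an
    edge src --weight--> dst stands for 0 <= dst - src + weight and owner is
    "A" or "B". Returns "false" if all edges are in A, "true" if all are in
    B, and otherwise a list of triples (x1, xn, c) meaning 0 <= xn - x1 + c,
    one per maximal A-path.
    """
    owners = {owner for _, _, _, owner in cycle}
    if owners == {"A"}:
        return "false"
    if owners == {"B"}:
        return "true"
    # Rotate the cycle so that it ends with a B edge: every maximal A-path
    # is then a contiguous run that is closed by a later B edge.
    start = next(i for i, e in enumerate(cycle) if e[3] == "B") + 1
    rotated = cycle[start:] + cycle[:start]
    result, current = [], None
    for src, dst, weight, owner in rotated:
        if owner == "A":
            if current is None:
                current = (src, dst, weight)
            else:
                current = (current[0], dst, current[2] + weight)
        elif current is not None:
            result.append(current)
            current = None
    return result

# Negative cycle of Example 4.1: x3 -> x2 -> x1 -> x5 -> x4 -> x3.
cycle = [
    ("x3", "x2", 0, "A"),   # 0 <= x2 - x3
    ("x2", "x1", 1, "A"),   # 0 <= x1 - x2 + 1
    ("x1", "x5", 0, "B"),   # 0 <= x5 - x1
    ("x5", "x4", -1, "A"),  # 0 <= x4 - x5 - 1
    ("x4", "x3", -1, "B"),  # 0 <= x3 - x4 - 1
]
interpolant = summary_constraints(cycle)
print(interpolant)
```

For the cycle of Example 4.1 the result encodes (0 <= x4 - x5 - 1) and (0 <= x1 - x3 + 1), the two conjuncts of the interpolant given in that example.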

A final remark is in order. In principle, in order to generate a proof of unsatisfiability for
a conjunction of $\mathcal{DL}(\mathbb{Q})$ atoms $A \wedge B$, the same rules used for $\mathcal{LA}(\mathbb{Q})$ [McMillan 2005]
could be used. For instance, it is easy to build a proof which repeatedly applies the COMB
rule with $c_1 = c_2 = 1$. In general, however, the interpolants generated from such proofs
are not $\mathcal{DL}(\mathbb{Q})$ formulas anymore and, if computed starting from the same inconsistent
set $C$, they are either identical or weaker than those generated with our method. In fact,
it is easy to see that, unless our technique of §3.3 is adopted, such interpolants are in the
form $(0 \le \sum_i t_i)$ s.t. $\bigwedge_i (0 \le t_i)$ is the corresponding interpolant generated with our
graph-based method.



Example 4.1. Consider the following sets of $\mathcal{DL}(\mathbb{Q})$ atoms:

$A = \{(0 \le x_1 - x_2 + 1),\ (0 \le x_2 - x_3),\ (0 \le x_4 - x_5 - 1)\}$

$B = \{(0 \le x_5 - x_1),\ (0 \le x_3 - x_4 - 1)\},$

corresponding to the negative cycle $x_3 \xrightarrow{0} x_2 \xrightarrow{1} x_1 \xrightarrow{0} x_5 \xrightarrow{-1} x_4 \xrightarrow{-1} x_3$ (shown in the
figure accompanying the example). It is straightforward to see from the graph
that the resulting interpolant is $(0 \le x_1 - x_3 + 1) \wedge (0 \le x_4 - x_5 - 1)$, because the first
conjunct is the summary constraint of the first two conjuncts in $A$.

Applying instead the rules of Figure 3 with coefficients 1, the proof of unsatisfiability is:

\[
\dfrac{\dfrac{\dfrac{\dfrac{(0 \le x_1 - x_2 + 1) \quad (0 \le x_2 - x_3)}
{(0 \le x_1 - x_3 + 1)} \quad (0 \le x_4 - x_5 - 1)}
{(0 \le x_1 - x_3 + x_4 - x_5)} \quad (0 \le x_5 - x_1)}
{(0 \le -x_3 + x_4)} \quad (0 \le x_3 - x_4 - 1)}
{(0 \le -1)}
\]

By using the interpolation rules for $\mathcal{LA}(\mathbb{Q})$, the interpolant we obtain is
$(0 \le x_1 - x_3 + x_4 - x_5)$, which is not in $\mathcal{DL}(\mathbb{Q})$, and is weaker than that computed above:

\[
\dfrac{\dfrac{\dfrac{\dfrac{(0 \le x_1 - x_2 + 1) \quad (0 \le x_2 - x_3)}
{(0 \le x_1 - x_3 + 1)} \quad (0 \le x_4 - x_5 - 1)}
{(0 \le x_1 - x_3 + x_4 - x_5)} \quad (0 \le 0)}
{(0 \le x_1 - x_3 + x_4 - x_5)} \quad (0 \le 0)}
{(0 \le x_1 - x_3 + x_4 - x_5)}
\]

Notice that, if instead we apply our technique of §3.3, then the $\mathcal{LA}(\mathbb{Q})$-interpolant
generated from the above proof is identical to the $\mathcal{DL}(\mathbb{Q})$ one above.



5. FROM SMT($\mathcal{UTVPI}$) SOLVING TO SMT($\mathcal{UTVPI}$) INTERPOLATION

The Unit-Two-Variables-Per-Inequality ($\mathcal{UTVPI}$) theory is a subtheory of linear
arithmetic, in which all constraints are in the form $(0 \le a x_1 + b x_2 + k)$, where $k$ is a
numerical constant, $a, b \in \{-1, 0, 1\}$, and variables $x_1, x_2$ range either over the rationals (for
$\mathcal{UTVPI}(\mathbb{Q})$) or over the integers (for $\mathcal{UTVPI}(\mathbb{Z})$). Consequently, $\mathcal{DL}(\mathbb{Q})$ is a subtheory
of $\mathcal{UTVPI}(\mathbb{Q})$, which is itself a subtheory of $\mathcal{LA}(\mathbb{Q})$, and $\mathcal{DL}(\mathbb{Z})$ is a subtheory of
$\mathcal{UTVPI}(\mathbb{Z})$, which is itself a subtheory of $\mathcal{LA}(\mathbb{Z})$.

As for $\mathcal{DL}$, $\mathcal{UTVPI}$ can be treated more efficiently than the full $\mathcal{LA}$, and several
specialized algorithms for $\mathcal{UTVPI}$ have been proposed in the literature. Traditional
techniques are based on the iterative computation of the transitive closure of the constraints
[Harvey and Stuckey 1997; Jaffar et al. 1994]; more recently [Lahiri and Musuvathi 2005]
proposed a novel technique based on a reduction to $\mathcal{DL}$, so that graph-based techniques
can be exploited, resulting in an asymptotically faster algorithm. We adopt the latter
approach and show how the graph-based interpolation technique of §4 can be extended to
$\mathcal{UTVPI}$, for both the rationals (§5.1) and the integers (§5.2).

\[
\begin{array}{l|l}
\mathcal{UTVPI}(\mathbb{Q}) \text{ constraints} & \mathcal{DL}(\mathbb{Q}) \text{ constraints} \\ \hline
(0 \le x_1 - x_2 + k) & (0 \le x_1^+ - x_2^+ + k),\ (0 \le x_2^- - x_1^- + k) \\
(0 \le -x_1 - x_2 + k) & (0 \le x_1^- - x_2^+ + k),\ (0 \le x_2^- - x_1^+ + k) \\
(0 \le x_1 + x_2 + k) & (0 \le x_1^+ - x_2^- + k),\ (0 \le x_2^+ - x_1^- + k) \\
(0 \le -x_1 + k) & (0 \le x_1^- - x_1^+ + 2 \cdot k) \\
(0 \le x_1 + k) & (0 \le x_1^+ - x_1^- + 2 \cdot k)
\end{array}
\]

Fig. 4. The conversion map from $\mathcal{UTVPI}(\mathbb{Q})$ to $\mathcal{DL}(\mathbb{Q})$.

5.1 Graph-based interpolation for $\mathcal{UTVPI}$ on the Rationals

We analyze first the simpler case of $\mathcal{UTVPI}(\mathbb{Q})$. Miné [Miné 2001] showed that it is
possible to encode a set of $\mathcal{UTVPI}(\mathbb{Q})$ constraints into a $\mathcal{DL}(\mathbb{Q})$ one in a
satisfiability-preserving way. The encoding works as follows. We use $x_i$ to denote variables in the
$\mathcal{UTVPI}(\mathbb{Q})$ domain and $u, v$ for variables in the $\mathcal{DL}(\mathbb{Q})$ domain. For every variable $x_i$
in $\mathcal{UTVPI}(\mathbb{Q})$, we introduce two distinct variables $x_i^+$ and $x_i^-$ in $\mathcal{DL}(\mathbb{Q})$. We introduce a
mapping $\Upsilon$ from $\mathcal{DL}(\mathbb{Q})$ variables to $\mathcal{UTVPI}(\mathbb{Q})$ signed variables, such that $\Upsilon(x_i^+) = x_i$
and $\Upsilon(x_i^-) = -x_i$. $\Upsilon$ extends to (sets of) constraints in the natural way:
$\Upsilon(0 \le a u + b v + k) = (0 \le a \Upsilon(u) + b \Upsilon(v) + k)$, and $\Upsilon(\{c_i\}_i) = \{\Upsilon(c_i)\}_i$. We say that
$(x_i^+)^- = x_i^-$ and $(x_i^-)^- = x_i^+$. We say that the constraints $(0 \le u - v + k)$ and
$(0 \le (v)^- - (u)^- + k)$ s.t. $u, v \in \{x_i^+, x_i^-\}_i$ are dual. We encode each $\mathcal{UTVPI}$ constraint into
the conjunction of two dual $\mathcal{DL}(\mathbb{Q})$ constraints, as represented in Figure 4. For each
$\mathcal{DL}(\mathbb{Q})$ constraint $(0 \le v - u + k)$, $(0 \le \Upsilon(v) - \Upsilon(u) + k)$ is the corresponding
$\mathcal{UTVPI}(\mathbb{Q})$ constraint. Notice that the two dual $\mathcal{DL}(\mathbb{Q})$ constraints in the right column of
Figure 4 are just different representations of the original $\mathcal{UTVPI}(\mathbb{Q})$ constraint. (The two
dual constraints encoding a single-variable constraint are identical, so that their conjunction
is collapsed into one constraint only.) The resulting set of constraints is satisfiable in
$\mathcal{DL}(\mathbb{Q})$ if and only if the original one is satisfiable in $\mathcal{UTVPI}(\mathbb{Q})$ [Miné 2001; Lahiri and
Musuvathi 2005].
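
The conversion map of Figure 4 and the back-translation through the mapping of signed variables can be written compactly by observing that a constraint 0 <= s1 + s2 + k over signed variables becomes the difference of the DL variable for s1 and the DL variable for the negation of s2. The sketch below follows that reading; the tuple encodings and the names `lit`, `utvpi_to_dl` and `dl_to_utvpi` are assumptions for illustration. It assumes at least one nonzero coefficient and two distinct variables.

```python
def lit(sign, x):
    """DL variable whose image under the signed-variable map is sign * x."""
    return (x, "+") if sign > 0 else (x, "-")

def utvpi_to_dl(a, x1, b, x2, k):
    """Encode 0 <= a*x1 + b*x2 + k (a, b in {-1, 0, 1}) as in Figure 4.

    Returns DL atoms (v, u, c), each meaning 0 <= v - u + c.
    """
    terms = [(s, x) for s, x in ((a, x1), (b, x2)) if s != 0]
    if len(terms) == 1:
        # Single-variable case: the two dual atoms coincide.
        (s, x), = terms
        return [(lit(s, x), lit(-s, x), 2 * k)]
    (s1, y1), (s2, y2) = terms
    return [(lit(s1, y1), lit(-s2, y2), k),
            (lit(s2, y2), lit(-s1, y1), k)]

def dl_to_utvpi(v, u, c):
    """Map the DL atom 0 <= v - u + c back to UTVPI coefficients.

    Returns ({variable: coefficient}, c).
    """
    coeffs = {}
    for (x, sign), factor in ((v, 1), (u, -1)):
        coeffs[x] = coeffs.get(x, 0) + factor * (1 if sign == "+" else -1)
    return coeffs, c

two_vars = utvpi_to_dl(1, "x1", -1, "x2", 5)    # 0 <= x1 - x2 + 5
one_var = utvpi_to_dl(-1, "x1", 0, None, 3)     # 0 <= -x1 + 3
back = dl_to_utvpi(("x2", "-"), ("x1", "-"), 5)
print(two_vars, one_var, back)
```

Note that mapping a single-variable atom back yields coefficient 2 or -2 and constant 2k, which is the same constraint scaled by 2.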

Consider the pair $(A, B)$ where $A$ and $B$ are sets of $\mathcal{UTVPI}(\mathbb{Q})$ constraints. We apply
the map of Figure 4 and we encode $(A, B)$ into a $\mathcal{DL}(\mathbb{Q})$ pair $(A', B')$, and build the
constraint graph $G(A' \wedge B')$. If $G(A' \wedge B')$ has no negative cycle, we can conclude that
$A' \wedge B'$ is $\mathcal{DL}(\mathbb{Q})$-consistent, and hence that $A \wedge B$ is $\mathcal{UTVPI}(\mathbb{Q})$-consistent; otherwise,
$A' \wedge B'$ is $\mathcal{DL}(\mathbb{Q})$-inconsistent, and hence $A \wedge B$ is $\mathcal{UTVPI}(\mathbb{Q})$-inconsistent [Miné 2001;
Lahiri and Musuvathi 2005]. In fact, it is straightforward to observe that for any set of
$\mathcal{DL}(\mathbb{Q})$ constraints $\{C_1, \ldots, C_n, C\}$ resulting from the encoding of some $\mathcal{UTVPI}(\mathbb{Q})$
constraints, if $\bigwedge_{i=1}^{n} C_i \models_{\mathcal{DL}(\mathbb{Q})} C$ then $\bigwedge_{i=1}^{n} \Upsilon(C_i) \models_{\mathcal{UTVPI}(\mathbb{Q})} \Upsilon(C)$.

When $A \wedge B$ is inconsistent, we can generate a $\mathcal{UTVPI}(\mathbb{Q})$-interpolant by extending
the graph-based approach used for $\mathcal{DL}(\mathbb{Q})$.

Theorem 5.1. Let $A \wedge B$ be an inconsistent conjunction of $\mathcal{UTVPI}(\mathbb{Q})$-constraints,
and let $G(A' \wedge B')$ be the corresponding graph of $\mathcal{DL}(\mathbb{Q})$-constraints. Let $I'$ be a
$\mathcal{DL}(\mathbb{Q})$-interpolant built from $G(A' \wedge B')$ with the technique described in §4. Then
$I = \Upsilon(I')$ is an interpolant for $(A, B)$.

Proof. (i) $I'$ is a conjunction of summary constraints, so it is in the form $\bigwedge_i C_i$.
Therefore $A' \models_{\mathcal{DL}(\mathbb{Q})} C_i$ for all $i$, and so by the observation above $A \models_{\mathcal{UTVPI}(\mathbb{Q})} \Upsilon(C_i)$.
Hence, $A \models_{\mathcal{UTVPI}(\mathbb{Q})} I$. (ii) From the $\mathcal{DL}(\mathbb{Q})$-inconsistency of $I' \wedge B'$ we immediately
derive that $I \wedge B$ is $\mathcal{UTVPI}(\mathbb{Q})$-inconsistent. (iii) $I \preceq A$ and $I \preceq B$ derive from $I' \preceq A'$
and $I' \preceq B'$ by the definitions of $\Upsilon$ and the map of Figure 4. □

Fig. 5. The constraint graph of Example 5.1. (We represent only one negative cycle with its
corresponding A-paths, because the other is dual.)

As with the $\mathcal{DL}(\mathbb{Q})$ case, in principle, it is possible to generate a proof of unsatisfiability
for a conjunction of $\mathcal{UTVPI}(\mathbb{Q})$ atoms $A \wedge B$ by repeatedly applying the COMB rule for
$\mathcal{LA}(\mathbb{Q})$ [McMillan 2005] with $c_1 = c_2 = 1$. As with $\mathcal{DL}(\mathbb{Q})$, however, the interpolants
generated from such proofs may not be $\mathcal{UTVPI}(\mathbb{Q})$ formulas anymore. Moreover, if
computed starting from the same inconsistent set $C$ and unless our technique of §3.3 is
adopted, they are either identical or weaker than those generated with our graph-based
method, since they are in the form $(0 \le \sum_i t_i)$ s.t. $\bigwedge_i (0 \le t_i)$ is the interpolant generated
with our method.

Example 5.1. Consider the following sets of $\mathcal{UTVPI}(\mathbb{Q})$ constraints:

$A = \{(0 \le -x_2 - x_1 + 3),\ (0 \le x_1 + x_3 + 1),\ (0 \le -x_3 - x_4 - 6),\ (0 \le x_5 + x_4 + 1)\}$

$B = \{(0 \le x_2 + x_3 + 3),\ (0 \le x_6 - x_5 - 1),\ (0 \le x_4 - x_6 + 4)\}$

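By the map of Figure 4, they are converted into the following sets of $\mathcal{DL}(\mathbb{Q})$ constraints:

\[
\begin{array}{rl}
A' = \{ & (0 \le x_1^- - x_2^+ + 3),\ (0 \le x_2^- - x_1^+ + 3), \\
& (0 \le x_3^+ - x_1^- + 1),\ (0 \le x_1^+ - x_3^- + 1), \\
& (0 \le x_4^- - x_3^+ - 6),\ (0 \le x_3^- - x_4^+ - 6), \\
& (0 \le x_5^+ - x_4^- + 1),\ (0 \le x_4^+ - x_5^- + 1) \ \} \\[4pt]
B' = \{ & (0 \le x_3^+ - x_2^- + 3),\ (0 \le x_2^+ - x_3^- + 3), \\
& (0 \le x_6^+ - x_5^+ - 1),\ (0 \le x_5^- - x_6^- - 1), \\
& (0 \le x_4^+ - x_6^+ + 4),\ (0 \le x_6^- - x_4^- + 4) \ \}
\end{array}
\]
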

whose conjunction corresponds to the constraint graph of Figure 5. This graph has a
negative cycle

\[
C = x_2^+ \xrightarrow{3} x_1^- \xrightarrow{1} x_3^+ \xrightarrow{-6} x_4^- \xrightarrow{4} x_6^- \xrightarrow{-1} x_5^- \xrightarrow{1} x_4^+ \xrightarrow{-6} x_3^- \xrightarrow{3} x_2^+ .
\]

Thus, $A \wedge B$ is inconsistent in $\mathcal{UTVPI}(\mathbb{Q})$. From the negative cycle $C$ we can extract
the set of $A'$-paths $\{x_2^+ \leadsto x_4^-,\ x_5^- \leadsto x_3^-\}$, corresponding to the formula
$I' = (0 \le x_4^- - x_2^+ - 2) \wedge (0 \le x_3^- - x_5^- - 5)$, which is an interpolant for $(A', B')$. $I'$ is thus
mapped back into $I = \Upsilon(I') = (0 \le -x_2 - x_4 - 2) \wedge (0 \le x_5 - x_3 - 5)$, which is an interpolant
for $(A, B)$.

Applying instead the $\mathcal{LA}(\mathbb{Q})$ interpolation technique of [McMillan 2005], we find the
interpolant $(0 \le -x_2 - x_4 + x_5 - x_3 - 7)$, which is not in $\mathcal{UTVPI}(\mathbb{Q})$ and is strictly
weaker than that computed with our method.

5.2 Graph-based interpolation for $\mathcal{UTVPI}$ on the Integers

In order to deal with the more complex case of $\mathcal{UTVPI}(\mathbb{Z})$, we adopt a layered
approach [Sebastiani 2007]. First, we check the consistency in $\mathcal{UTVPI}(\mathbb{Q})$ using the
technique of [Miné 2001]. If this results in an inconsistency, we compute a $\mathcal{UTVPI}(\mathbb{Q})$-interpolant
as described in §5.1. If the $\mathcal{UTVPI}(\mathbb{Q})$-procedure does not detect an inconsistency,
we check the consistency in $\mathcal{UTVPI}(\mathbb{Z})$ using the algorithm proposed by Lahiri and
Musuvathi in [Lahiri and Musuvathi 2005], which extends the ideas of [Miné 2001] to the
integer domain. In particular, it gives necessary and sufficient conditions to decide
unsatisfiability by detecting particular kinds of zero-weight cycles in the induced $\mathcal{DL}$ constraint
graph. This procedure works in $O(n \cdot m)$ time and $O(n + m)$ space, $m$ and $n$ being the
number of constraints and variables respectively, which improves on the $O(n^2 \cdot m)$
time and $O(n^2)$ space complexity of the previous procedure of [Jaffar et al. 1994].

We build on top of this algorithm and we extend the graph-based approach of §5.1 for
producing interpolants also in $\mathcal{UTVPI}(\mathbb{Z})$. In particular, we use the following reformulation
of a result of [Lahiri and Musuvathi 2005].

Theorem 5.2. Let $\phi$ be a conjunction of $\mathcal{UTVPI}(\mathbb{Z})$ constraints s.t. $\phi$ is satisfiable
in $\mathcal{UTVPI}(\mathbb{Q})$. Then $\phi$ is unsatisfiable in $\mathcal{UTVPI}(\mathbb{Z})$ iff the constraint graph $G(\phi)$
generated from $\phi$ has a cycle $C$ of weight 0 containing two vertices $x_i^+$ and $x_i^-$ s.t. the
weight of the path $x_i^- \leadsto x_i^+$ along $C$ is odd.

Proof. The "only if" part is a corollary of Lemmas 1, 2 and 4 in [Lahiri and Musuvathi
2005]. The "if" part comes straightforwardly from the analysis done in [Lahiri and Musuvathi
2005], whose main intuitions we recall in what follows. Assume the constraint graph $G(\phi)$
generated from $\phi$ has one cycle $C$ of weight 0 containing two vertices $x_i^+$ and $x_i^-$ s.t. the
weight of the path $x_i^- \leadsto x_i^+$ along $C$ is $2k + 1$ for some integer value $k$. (Since $C$
has weight 0, the weight of the other path $x_i^+ \leadsto x_i^-$ along $C$ is $-2k - 1$.) Then, the
paths $x_i^- \leadsto x_i^+$ and $x_i^+ \leadsto x_i^-$ contain at least two constraints, because otherwise their
weight would be even (see the last two lines of Figure 4). Then, $x_i^- \leadsto x_i^+$ is in the
form $x_i^- \leadsto v \xrightarrow{n} x_i^+$, for some $v$ and $n$. From $x_i^- \leadsto v$, we can derive the summary
constraint $(0 \le v - x_i^- + (2k + 1 - n))$, which corresponds to the $\mathcal{UTVPI}(\mathbb{Z})$ constraint
$(0 \le \Upsilon(v) + x_i + (2k + 1 - n))$. (This corresponds to $l - 2$ applications of the TRANSITIVE
rule of [Lahiri and Musuvathi 2005], $l$ being the number of constraints in $x_i^- \leadsto x_i^+$.)
Then, by observing that the $\mathcal{UTVPI}(\mathbb{Z})$ constraint corresponding to $v \xrightarrow{n} x_i^+$ is
$(0 \le x_i - \Upsilon(v) + n)$, we can apply the TIGHTENING rule of [Lahiri and Musuvathi 2005] to
obtain $(0 \le x_i + \lfloor (2k + 1 - n + n)/2 \rfloor)$, which is equivalent to $(0 \le x_i + k)$. Similarly,
from $x_i^+ \leadsto x_i^-$ we can obtain $(0 \le -x_i - k - 1)$, and thus an inconsistency using the
CONTRADICTION rule of [Lahiri and Musuvathi 2005]. □

Consider a pair $(A, B)$ of $\mathcal{UTVPI}(\mathbb{Z})$ constraints such that $A \wedge B$ is consistent in
$\mathcal{UTVPI}(\mathbb{Q})$ but inconsistent in $\mathcal{UTVPI}(\mathbb{Z})$. By Theorem 5.2, the constraint graph
$G(A' \wedge B')$ has a cycle $C$ of weight 0 containing two vertices $x_i^+$ and $x_i^-$ s.t. the weights of the
paths $x_i^- \leadsto x_i^+$ and $x_i^+ \leadsto x_i^-$ along $C$ are $2k + 1$ and $-2k - 1$ respectively, for some
value $k \in \mathbb{Z}$. Our algorithm computes an interpolant for $(A, B)$ from the cycle $C$. Let
$C_A$ and $C_B$ be the subsets of the edges in $C$ corresponding to constraints in $A'$ and $B'$
respectively. We have to distinguish four distinct sub-cases.

Fig. 6. $\mathcal{UTVPI}(\mathbb{Z})$ interpolation. Case 1.

Fig. 7. $\mathcal{UTVPI}(\mathbb{Z})$ interpolation. Case 2.

Case 1: $x_i$ occurs in $B$ but not in $A$. Consequently, $x_i^+$ and $x_i^-$ occur in $B'$ but not in
$A'$, and hence they occur in $C_B$ but not in $C_A$. Let $I'$ be the conjunction of the summary
constraints of the maximal $C_A$-paths, and let $I$ be the conjunction of the corresponding
$\mathcal{UTVPI}(\mathbb{Z})$ constraints.

Theorem 5.3. $I$ is an interpolant for $(A, B)$.

Proof. (i) By construction, $A \models_{\mathcal{UTVPI}(\mathbb{Z})} I$, as in §5.1. (ii) The constraints in $I'$
and $C_B$ form a cycle matching the hypotheses of Theorem 5.2, from which $I \wedge B$ is
$\mathcal{UTVPI}(\mathbb{Z})$-inconsistent. (iii) We notice that every variable $x_j^+$, $x_j^-$ occurring in the
conjunction of the summary constraints is an end-point variable, so that $I' \preceq C_A$ and $I' \preceq C_B$,
and thus $I \preceq A$ and $I \preceq B$. □

Example 5.2. Consider the following set of constraints:

$S = \{(0 \le x_1 - x_2 + 4),\ (0 \le -x_2 - x_3 - 5),\ (0 \le x_2 + x_6 - 4),\ (0 \le x_5 + x_2 + 3),\ (0 \le -x_1 + x_3 + 2),\ (0 \le -x_6 - x_4),\ (0 \le x_4 - x_5)\}$,

partitioned into $A$ and $B$ as follows:

$A = \{(0 \le x_3 - x_1 + 2),\ (0 \le -x_6 - x_4),\ (0 \le x_4 - x_5)\}$
$B = \{(0 \le x_1 - x_2 + 4),\ (0 \le -x_2 - x_3 - 5),\ (0 \le x_2 + x_6 - 4),\ (0 \le x_5 + x_2 + 3)\}$

Figure 6 shows a zero-weight cycle $C$ in $G(A' \wedge B')$ such that the paths $x_2^- \leadsto x_2^+$ and $x_2^+ \leadsto x_2^-$ have an odd weight ($-1$ and $1$ resp.). Therefore, by Theorem 5.2, $A \wedge B$ is $\mathcal{UTVPI}(\mathbb{Z})$-inconsistent. The two summary constraints of the maximal $C_A$ paths are $(0 \le x_6^- - x_5^+)$ and $(0 \le x_3^+ - x_1^+ + 2)$. It is easy to see that $I = (0 \le -x_6 - x_5) \wedge (0 \le x_3 - x_1 + 2)$ is a $\mathcal{UTVPI}(\mathbb{Z})$-interpolant for $(A, B)$.
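The odd path weights quoted in Example 5.2 can be recomputed mechanically. The following is a minimal sketch, not the authors' implementation: it assumes the usual encoding in which each constraint $0 \le p + q + c$ over signed variables yields the two dual edges $\bar{q} \to p$ and $\bar{p} \to q$ of weight $c$ (an edge $u \to v$ of weight $w$ standing for $0 \le v - u + w$), and it uses Floyd-Warshall for the minimal path weights. All function names are hypothetical.

```python
from itertools import product

# A UTVPI constraint 0 <= s1*v1 + s2*v2 + c is written ((s1, v1), (s2, v2), c).
S = [
    ((+1, "x1"), (-1, "x2"), 4), ((-1, "x2"), (-1, "x3"), -5),
    ((+1, "x2"), (+1, "x6"), -4), ((+1, "x5"), (+1, "x2"), 3),
    ((-1, "x1"), (+1, "x3"), 2), ((-1, "x6"), (-1, "x4"), 0),
    ((+1, "x4"), (-1, "x5"), 0),
]


def edges(constraints):
    """Two dual edges per constraint; an edge (u, v, w) encodes 0 <= v - u + w.
    A vertex (x, +1) stands for x^+ (that is, x) and (x, -1) for x^- (that is, -x)."""
    out = []
    for (s1, v1), (s2, v2), c in constraints:
        out.append(((v2, -s2), (v1, s1), c))
        out.append(((v1, -s1), (v2, s2), c))
    return out


def shortest_paths(constraints):
    es = edges(constraints)
    nodes = sorted({n for u, v, _ in es for n in (u, v)})
    dist = {(u, v): 0 if u == v else float("inf") for u in nodes for v in nodes}
    for u, v, w in es:
        dist[u, v] = min(dist[u, v], w)
    for m, u, v in product(nodes, repeat=3):  # Floyd-Warshall, m is the outermost loop
        dist[u, v] = min(dist[u, v], dist[u, m] + dist[m, v])
    return nodes, dist


def odd_path_weights(constraints, var):
    """Minimal weights of the paths var^- ~> var^+ and var^+ ~> var^-."""
    nodes, dist = shortest_paths(constraints)
    assert all(dist[n, n] == 0 for n in nodes), "negative cycle: inconsistent over Q"
    return dist[(var, -1), (var, +1)], dist[(var, +1), (var, -1)]


up, down = odd_path_weights(S, "x2")
print(up, down)
```

When the two weights sum to zero and are odd, the hypotheses of Theorem 5.2 are met for that variable; for $x_2$ the sketch returns the weights $-1$ and $1$ stated in the example.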

Case 2: $x_i$ occurs in both $A$ and $B$. Consequently, $x_i^+$ and $x_i^-$ occur in both $A'$ and $B'$. If neither $x_i^+$ nor $x_i^-$ is such that both the incoming and outgoing edges belong to $C_A$, then the cycle obtained by replacing each maximal $C_A$-path with its summary constraint still contains both $x_i^+$ and $x_i^-$, so we can apply the same process of Case 1. Otherwise, if both the incoming and outgoing edges of $x_i^+$ belong to $C_A$, then we split the maximal $C_A$-path $u_1 \to \cdots \to x_i^+ \to \cdots \to u_n$ containing $x_i^+$ into the two parts which are separated by $x_i^+$: $u_1 \to \cdots \to x_i^+$ and $x_i^+ \to \cdots \to u_n$. We do the same for $x_i^-$. Let $I'$ be the conjunction of the resulting summary constraints, and let $I$ be the corresponding set of $\mathcal{UTVPI}(\mathbb{Z})$ constraints.

Theorem 5.4. $I$ is an interpolant for $(A, B)$.

Proof. (i) As with Case 1, again, $A \models_{\mathcal{UTVPI}(\mathbb{Z})} I$. (ii) Since we split the maximal $C_A$ paths as described above, the constraints in $I'$ and $C_B$ form a cycle matching the hypotheses of Theorem 5.2, from which $I \wedge B$ is $\mathcal{UTVPI}(\mathbb{Z})$-inconsistent. (iii) $x_i^+$, $x_i^-$ occur in both $A'$ and $B'$ by hypothesis, and every other variable $x_j^+$, $x_j^-$ occurring in the conjunction of the summary constraints is an end-point variable, so that $I' \preceq C_A$ and $I' \preceq C_B$, and thus $I \preceq A$ and $I \preceq B$. □

Example 5.3. Consider again the set of constraints $S$ of Example 5.2, partitioned into $A$ and $B$ as follows:

$A = \{(0 \le x_3 - x_1 + 2),\ (0 \le x_1 - x_2 + 4),\ (0 \le x_2 + x_6 - 4),\ (0 \le -x_6 - x_4)\}$
$B = \{(0 \le -x_2 - x_3 - 5),\ (0 \le x_5 + x_2 + 3),\ (0 \le x_4 - x_5)\}$

and the zero-weight cycle $C$ of $G(A' \wedge B')$ shown in Figure 7. As in the previous example, there is a path $x_2^- \leadsto x_2^+$ of weight $-1$ and a path $x_2^+ \leadsto x_2^-$ of weight $1$. In this case there is only one maximal $C_A$ path, namely $x_4^+ \leadsto x_3^+$. Since the cycle obtained by replacing it with its summary constraint $(0 \le x_3^+ - x_4^+ + 2)$ does not contain $x_2^+$, we split $x_4^+ \leadsto x_3^+$ into two paths, $x_4^+ \leadsto x_2^+$ and $x_2^+ \leadsto x_3^+$, whose summary constraints are $(0 \le x_2^+ - x_4^+ - 4)$ and $(0 \le x_3^+ - x_2^+ + 6)$ respectively. By replacing the two paths above with the two summary constraints, we get a zero-weight cycle which still contains the two odd paths $x_2^- \leadsto x_2^+$ and $x_2^+ \leadsto x_2^-$. Therefore, $I = (0 \le x_2 - x_4 - 4) \wedge (0 \le x_3 - x_2 + 6)$ is an interpolant for $(A, B)$.

Notice that the $\mathcal{UTVPI}(\mathbb{Z})$-formula $J = (0 \le x_3 - x_4 + 2)$ corresponding to the summary constraint of the maximal $C_A$ path $x_4^+ \leadsto x_3^+$ is not an interpolant, since $J \wedge B$ is not $\mathcal{UTVPI}(\mathbb{Z})$-inconsistent. In fact, if we replace the maximal $C_A$ path $x_4^+ \leadsto x_3^+$ with the summary constraint $x_4^+ \xrightarrow{2} x_3^+$, the cycle we obtain has still weight zero, but it contains no odd path between two variables $x_i^+$ and $x_i^-$.
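A summary constraint is simply the sum of the constraints along a path, with the intermediate variables cancelling out. The sketch below recomputes the two summary constraints of Example 5.3 and the formula $J$; the representation of a constraint $0 \le \sum_v k_v v + c$ as a pair (coefficient dictionary, constant) and the name `summary` are assumptions made for illustration.

```python
from collections import Counter


def summary(*constraints):
    """Sum constraints of the form (coeffs, const), read as 0 <= sum(coeffs[v]*v) + const.
    Variables whose coefficients cancel are dropped from the result."""
    total, const = Counter(), 0
    for coeffs, c in constraints:
        total.update(coeffs)  # Counter.update adds the coefficients variable by variable
        const += c
    return {v: k for v, k in total.items() if k != 0}, const


c1 = ({"x3": 1, "x1": -1}, 2)    # 0 <= x3 - x1 + 2
c2 = ({"x1": 1, "x2": -1}, 4)    # 0 <= x1 - x2 + 4
c3 = ({"x2": 1, "x6": 1}, -4)    # 0 <= x2 + x6 - 4
c4 = ({"x6": -1, "x4": -1}, 0)   # 0 <= -x6 - x4

print(summary(c3, c4))           # first half of the split path
print(summary(c1, c2))           # second half of the split path
print(summary(c1, c2, c3, c4))   # the whole maximal path, i.e. J
```

Splitting the path at $x_2^+$ keeps $x_2$ visible in the interpolant, whereas summing the whole path eliminates it, which is why $J$ is too weak.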

Case 3: $x_i$ occurs in $A$ but not in $B$, and one of the paths $x_i^+ \leadsto x_i^-$ or $x_i^- \leadsto x_i^+$ in $C$ contains only constraints of $C_A$. In this case, $x_i^+$ and $x_i^-$ occur in $A'$ but not in $B'$. Suppose that $x_i^- \leadsto x_i^+$ consists only of constraints of $C_A$ (the case $x_i^+ \leadsto x_i^-$ is analogous). Let $2k + 1$ be the weight of the path $x_i^- \leadsto x_i^+$ (which is odd by hypothesis), and let $C^P$ be the cycle obtained by replacing such path with the edge $x_i^- \xrightarrow{2k} x_i^+$ in $C$. In the following, we call such a replacement tightening summarization. Since $C$ has weight zero, $C^P$ has negative weight. Let $C_A^P$ be the set of $\mathcal{DL}$-constraints in the path $x_i^- \leadsto x_i^+$. Let $I'$ be the $\mathcal{DL}$-interpolant computed from $C^P$ for $(C_A \setminus C_A^P \cup \{(0 \le x_i^+ - x_i^- + 2k)\},\ C_B)$, and let $I$ be the corresponding $\mathcal{UTVPI}(\mathbb{Z})$ formula.

Fig. 8. $\mathcal{UTVPI}(\mathbb{Z})$ interpolation. Case 3.
Fig. 9. $\mathcal{UTVPI}(\mathbb{Z})$ interpolation. Case 4.

Theorem 5.5. $I$ is an interpolant for $(A, B)$.

Proof. (i) Let $P$ be the set of $\mathcal{UTVPI}(\mathbb{Z})$ constraints in the path $x_i^- \leadsto x_i^+$. Since the weight $2k + 1$ of such path is odd, we have that $P \models_{\mathcal{UTVPI}(\mathbb{Z})} (0 \le x_i + k)$ (cf. page 23). Since $P \subseteq A$, therefore, $A \models_{\mathcal{UTVPI}(\mathbb{Z})} (0 \le x_i + k)$. By observing that $(0 \le x_i^+ - x_i^- + 2k)$ is the $\mathcal{DL}$-constraint corresponding to $(0 \le x_i + k)$, we conclude that $C_A \setminus C_A^P \cup \{(0 \le x_i^+ - x_i^- + 2k)\} \models_{\mathcal{DL}} I'$ implies that $A \setminus P \cup \{(0 \le x_i + k)\} \models_{\mathcal{UTVPI}(\mathbb{Z})} I$, and so that $A \models_{\mathcal{UTVPI}(\mathbb{Z})} I$.

(ii) Since all the constraints in $C_B$ occur in $C^P$, we have that $B \wedge I$ is $\mathcal{UTVPI}(\mathbb{Z})$-inconsistent.

(iii) Since by hypothesis all the constraints in the path $x_i^- \leadsto x_i^+$ occur in $C_A$, from $I' \preceq (C_A \setminus C_A^P \cup \{(0 \le x_i^+ - x_i^- + 2k)\})$ we have that $I \preceq A$. Finally, since all the constraints in $C_B$ occur in $C^P$, we have that $I \preceq B$. □

Example 5.4. Consider again the set $S$ of constraints of Example 5.2, this time partitioned into $A$ and $B$ as follows:

$A = \{(0 \le x_1 - x_2 + 4),\ (0 \le x_3 - x_1 + 2),\ (0 \le -x_2 - x_3 - 5),\ (0 \le x_2 + x_6 - 4)\}$
$B = \{(0 \le x_5 + x_2 + 3),\ (0 \le -x_6 - x_4),\ (0 \le x_4 - x_5)\}$

Figure 8 shows a zero-weight cycle $C$ of $G(A' \wedge B')$. The only maximal $C_A$ path is $x_6^- \leadsto x_2^-$. Since the path $x_2^+ \leadsto x_2^-$ has weight 1, we can add the tightening edge $x_2^+ \xrightarrow{0} x_2^-$ to $G(A' \wedge B')$ (shown in dots and dashes in Figure 8), corresponding to the constraint $(0 \le x_2^- - x_2^+)$. Since all constraints in the path $x_2^+ \leadsto x_2^-$ belong to $A'$, $A' \models (0 \le x_2^- - x_2^+)$. Moreover, the cycle obtained by replacing the path $x_2^+ \leadsto x_2^-$ with the tightening edge $x_2^+ \xrightarrow{0} x_2^-$ has a negative weight ($-1$). Therefore, we can generate a $\mathcal{DL}$-interpolant $I' = (0 \le x_2^- - x_6^- - 4)$ from such cycle, which corresponds to the $\mathcal{UTVPI}(\mathbb{Z})$-interpolant $I = (0 \le -x_2 + x_6 - 4)$.

Notice that, similarly to Example 5.3, also in this case we cannot obtain an interpolant from the summary constraint $(0 \le x_2^- - x_6^- - 3)$ of the maximal $C_A$ path $x_6^- \leadsto x_2^-$, as $(0 \le -x_2 + x_6 - 3) \wedge B$ is not $\mathcal{UTVPI}(\mathbb{Z})$-inconsistent.
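The claims of Example 5.4 can be sanity-checked by enumerating integer assignments in a small box. This is only a bounded check under stated assumptions (the partition as read above, a hypothetical box of $[-6, 6]$ per variable), not a proof: it confirms that every enumerated model of $A$ satisfies $I$, that $I \wedge B$ has no model in the box, and that the weaker summary constraint is compatible with $B$.

```python
from itertools import product


def holds(constraint, model):
    """A constraint (coeffs, const) is read as 0 <= sum(coeffs[v] * v) + const."""
    coeffs, const = constraint
    return sum(k * model[v] for v, k in coeffs.items()) + const >= 0


def models(constraints, box):
    """Enumerate the integer models of `constraints` with every variable in [-box, box]."""
    names = sorted({v for coeffs, _ in constraints for v in coeffs})
    for values in product(range(-box, box + 1), repeat=len(names)):
        model = dict(zip(names, values))
        if all(holds(c, model) for c in constraints):
            yield model


A = [({"x1": 1, "x2": -1}, 4), ({"x3": 1, "x1": -1}, 2),
     ({"x2": -1, "x3": -1}, -5), ({"x2": 1, "x6": 1}, -4)]
B = [({"x5": 1, "x2": 1}, 3), ({"x6": -1, "x4": -1}, 0), ({"x4": 1, "x5": -1}, 0)]
I = ({"x2": -1, "x6": 1}, -4)      # interpolant obtained by tightening summarization
WEAK = ({"x2": -1, "x6": 1}, -3)   # plain summary constraint of the maximal C_A path

a_entails_i = all(holds(I, m) for m in models(A, 6))
i_and_b = next(models(B + [I], 6), None)
weak_and_b = next(models(B + [WEAK], 6), None)
print(a_entails_i, i_and_b, weak_and_b is not None)
```

The witness returned for `WEAK` shows concretely why the untightened summary constraint does not contradict $B$.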

Case 4: $x_i$ occurs in $A$ but not in $B$, and neither the path $x_i^+ \leadsto x_i^-$ nor the path $x_i^- \leadsto x_i^+$ in $C$ consists only of constraints of $C_A$. As in the previous case, $x_i^+$ and $x_i^-$ occur in $A'$ but not in $B'$, and hence they occur in $C_A$ but not in $C_B$. In this case, however, we can apply a tightening summarization neither to $x_i^+ \leadsto x_i^-$ nor to $x_i^- \leadsto x_i^+$, since none of the two paths consists only of constraints of $C_A$. We can, however, perform a conditional tightening summarization as follows. Let $C_A^P$ and $C_B^P$ be the sets of constraints of $C_A$ and $C_B$ respectively occurring in the path $x_i^- \leadsto x_i^+$, and let $\bar{C}_A^P$ and $\bar{C}_B^P$ be the sets of summary constraints of maximal paths in $C_A^P$ and $C_B^P$. From $\bar{C}_A^P \cup \bar{C}_B^P$, we can derive $x_i^- \xrightarrow{2k} x_i^+$ (cf. Case 3), where $2k + 1$ is the weight of the path $x_i^- \leadsto x_i^+$. Therefore, $\bar{C}_A^P \cup \bar{C}_B^P \models (0 \le x_i^+ - x_i^- + 2k)$, and thus $\bar{C}_A^P \models \bar{C}_B^P \rightarrow (0 \le x_i^+ - x_i^- + 2k)$. We say that $(0 \le x_i^+ - x_i^- + 2k)$ is the summary constraint for $x_i^- \leadsto x_i^+$ conditioned to $\bar{C}_B^P$.

Using conditional tightening summarization, we generate an interpolant as follows. By replacing the path $x_i^- \leadsto x_i^+$ with $x_i^- \xrightarrow{2k} x_i^+$, we obtain a negative-weight cycle $C^P$, as in Case 3. Let $I'$ be the $\mathcal{DL}$-interpolant computed from $C^P$ for $(C_A \setminus C_A^P \cup \{(0 \le x_i^+ - x_i^- + 2k)\},\ C_B \setminus C_B^P)$, and let $I$ be the corresponding $\mathcal{UTVPI}(\mathbb{Z})$ formula. Finally, let $P_B$ be the conjunction of $\mathcal{UTVPI}(\mathbb{Z})$ constraints corresponding to $\bar{C}_B^P$.

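Theorem 5.6. $(P_B \rightarrow I)$ is an interpolant for $(A, B)$.

Proof. (i) We know that $C_A \setminus C_A^P \cup \{(0 \le x_i^+ - x_i^- + 2k)\} \models_{\mathcal{DL}} I'$, because $I'$ is a $\mathcal{DL}$-interpolant. Moreover, $\bar{C}_A^P \cup \bar{C}_B^P \models (0 \le x_i^+ - x_i^- + 2k)$, and so $C_A^P \cup \bar{C}_B^P \models (0 \le x_i^+ - x_i^- + 2k)$. Therefore, $C_A \cup \bar{C}_B^P \models I'$, and hence $A \cup P_B \models_{\mathcal{UTVPI}(\mathbb{Z})} I$, from which $A \models_{\mathcal{UTVPI}(\mathbb{Z})} (P_B \rightarrow I)$.

(ii) Since $I'$ is a $\mathcal{DL}$-interpolant for $(C_A \setminus C_A^P \cup \{(0 \le x_i^+ - x_i^- + 2k)\},\ C_B \setminus C_B^P)$, $I' \wedge (C_B \setminus C_B^P)$ is $\mathcal{DL}$-inconsistent, and thus $I \wedge B$ is $\mathcal{UTVPI}(\mathbb{Z})$-inconsistent. Since by construction $B \models_{\mathcal{UTVPI}(\mathbb{Z})} P_B$, $(P_B \rightarrow I) \wedge B$ is $\mathcal{UTVPI}(\mathbb{Z})$-inconsistent.

(iii) From $I' \preceq C_B \setminus C_B^P$ we have that $I \preceq B$, and from $I' \preceq C_A \setminus C_A^P \cup \{(0 \le x_i^+ - x_i^- + 2k)\}$ that $I \preceq A$. Moreover, all the variables occurring in the constraints in $\bar{C}_B^P$ are end-point variables, so that $\bar{C}_B^P \preceq C_A$ and $\bar{C}_B^P \preceq C_B$, and thus $P_B \preceq A$ and $P_B \preceq B$. Therefore, $(P_B \rightarrow I) \preceq A$ and $(P_B \rightarrow I) \preceq B$. □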

Example 5.5. We partition the set $S$ of constraints of Example 5.2 into $A$ and $B$ as follows:

$A = \{(0 \le x_1 - x_2 + 4),\ (0 \le -x_2 - x_3 - 5),\ (0 \le x_5 + x_2 + 3),\ (0 \le x_2 + x_6 - 4)\}$
$B = \{(0 \le x_3 - x_1 + 2),\ (0 \le -x_6 - x_4),\ (0 \le x_4 - x_5)\}$

Consider the zero-weight cycle $C$ of $G(A' \wedge B')$ shown in Figure 9. In this case, neither the path $x_2^+ \leadsto x_2^-$ nor the path $x_2^- \leadsto x_2^+$ consists only of constraints of $A'$, and thus we cannot use any of the two tightening edges $x_2^+ \xrightarrow{0} x_2^-$ and $x_2^- \xrightarrow{-2} x_2^+$ directly for computing an interpolant. However, we can compute the summary $x_2^- \xrightarrow{-2} x_2^+$ for $x_2^- \leadsto x_2^+$ conditioned to $x_5^+ \xrightarrow{0} x_6^-$, which is the summary constraint of the $B$-path $x_5^+ \leadsto x_6^-$, and whose corresponding $\mathcal{UTVPI}(\mathbb{Z})$ constraint is $(0 \le -x_6 - x_5)$. By replacing the path $x_2^- \leadsto x_2^+$ with such summary, we obtain a negative-weight cycle $C^P$, from which we generate the $\mathcal{DL}$-interpolant $(0 \le x_1^+ - x_3^+ - 3)$, corresponding to the $\mathcal{UTVPI}(\mathbb{Z})$ formula $(0 \le x_1 - x_3 - 3)$. Therefore, the generated $\mathcal{UTVPI}(\mathbb{Z})$-interpolant is $(0 \le -x_6 - x_5) \rightarrow (0 \le x_1 - x_3 - 3)$.

As in Example 5.4, notice that we cannot generate an interpolant from the conjunction of summary constraints of maximal $C_A$ paths, since the formula we obtain (i.e. $(0 \le x_1 + x_6) \wedge (0 \le x_5 - x_3 - 2)$) is not inconsistent with $B$.

6. COMPUTING INTERPOLANTS FOR COMBINED THEORIES VIA DTC 

In this Section, we consider the problem of generating interpolants for a pair of $T_1 \cup T_2$-formulas $(A, B)$, and propose a method based on the Delayed Theory Combination (DTC) approach [Bozzano et al. 2006]. First, in §6.1 we provide some background on the Nelson-Oppen (NO) and DTC combination methods, and recall from [Yorsh and Musuvathi 2005] the basics of interpolation for combined theories using NO; then, we present our novel technique for computing interpolants using DTC (§6.2); in §6.3 we discuss the advantages of the novel method; finally, in §6.4, we show how our novel technique can be used to generate multiple interpolants from the same proof.

6.1 Background 

6.1.1 Resolution proofs with NO vs. resolution proofs with DTC. One of the typical approaches to the SMT problem in combined theories, SMT($T_1 \cup T_2$), is that of combining the solvers for $T_1$ and for $T_2$ with the Nelson-Oppen (NO) integration schema [Nelson and Oppen 1979]. The NO framework works for combinations of stably-infinite, signature-disjoint theories $T_i$ with equality. Moreover, it requires the input formula to be pure (i.e., s.t. all the atoms contain only symbols in one theory): if not, a purification step is performed, by recursively labeling terms $t$ with fresh variables $v_t$, and by conjoining the definition atom $(v_t = t)$ to the formula. This process is linear in the size of the input formula. For instance, the formula $(f(x + 3y) = g(2x - y))$ can be purified into $(f(v_{x+3y}) = g(v_{2x-y})) \wedge (v_{x+3y} = x + 3y) \wedge (v_{2x-y} = 2x - y)$.
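Purification can be sketched in a few lines. The encoding below is an assumption made for illustration: terms are nested tuples `(symbol, arg, ...)`, variables are strings, numerals are integers, the arithmetic symbols are `+`, `-`, `*`, and every other function symbol is treated as uninterpreted. Fresh variables are named `v1`, `v2`, ... rather than $v_t$.

```python
import itertools

ARITH = {"+", "-", "*"}  # assumed arithmetic signature; anything else is uninterpreted


def theory_of(term):
    """Theory of the top symbol of a term; variables and numerals are shared (None)."""
    if isinstance(term, tuple):
        return "LA" if term[0] in ARITH else "EUF"
    return None


def purify(term, defs, fresh):
    """Return a pure version of `term`. Every alien subterm is replaced by a
    fresh variable, and the definition (variable = subterm) is recorded in `defs`."""
    if not isinstance(term, tuple):
        return term
    theory = theory_of(term)
    args = []
    for arg in term[1:]:
        arg = purify(arg, defs, fresh)
        if theory_of(arg) not in (None, theory):  # the argument is an alien subterm
            if arg not in defs:
                defs[arg] = next(fresh)
            arg = defs[arg]
        args.append(arg)
    return (term[0], *args)


defs = {}
fresh = (f"v{i}" for i in itertools.count(1))
lhs = purify(("f", ("+", "x", ("*", 3, "y"))), defs, fresh)   # f(x + 3y)
rhs = purify(("g", ("-", ("*", 2, "x"), "y")), defs, fresh)   # g(2x - y)
print(lhs, rhs, defs)
```

Each term is visited once, which matches the remark that the process is linear in the size of the input formula; the entries of `defs` are the definition atoms to be conjoined.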

In the NO setting, the two decision procedures for $T_1$ and $T_2$ cooperate by deducing and exchanging interface equalities, that is, equalities between variables appearing in atoms of different theories (interface variables).

With an NO-based SMT solver, resolution proofs for formulas in a combination $T_1 \cup T_2$ of theories have the same structure as those for formulas in a single theory $T$. The only difference is that theory lemmas in this case are the result of the NO-combination of $T_1$ and $T_2$ (i.e., they are $T_1 \cup T_2$-lemmas) (Figure 10 left). From the point of view of interpolation, the difference with respect to the case of a single theory $T$ is that the $T_1 \cup T_2$-interpolants for the negations of the $T_1 \cup T_2$-lemmas can be computed with the combination method of [Yorsh and Musuvathi 2005] whenever it applies (see §6.1.2).

Recently, an alternative approach for combining theories in SMT has been proposed, called Delayed Theory Combination (DTC) [Bozzano et al. 2006]. With DTC, the solvers for $T_1$ and $T_2$ do not communicate directly. The integration is performed by the SAT solver, by augmenting the Boolean search space with up to all the possible interface equalities, so that each truth assignment on both original atoms and interface equalities is checked for consistency independently on both theories. DTC has several advantages wrt. NO, in terms of versatility, efficiency, and restrictions imposed to $T$-solvers [Bozzano et al. 2006; Bruttomesso et al. 2008a], so that many current SMT tools implement variants and evolutions of DTC.

Fig. 10. Different structures of resolution proofs of unsatisfiability for $T_1 \cup T_2$-formulas, using NO (left) and DTC (right).

Footnote (on purification): As shown in [Barrett et al. 2002], the purification step is not strictly necessary. However, in the rest we shall assume that it is performed (as it is traditionally done in papers on combination of theories), since it makes the exposition easier.

Footnote (on interface equalities): The two decision procedures deduce and exchange disjunctions of interface equalities if the theory is not convex.

With DTC, resolution proofs are quite different from those obtained with NO. There is no $T_1 \cup T_2$-lemma anymore, because the two $T_i$-solvers don't communicate directly. Instead, the proofs contain both $T_1$-lemmas and $T_2$-lemmas (Figure 10 right), and, importantly, they also contain interface equalities. (Notice that $T_i$-lemmas derive either from $T_i$-conflicts or from $T_i$-propagation steps.) In this case, the combination of theories is encoded directly in the proofs (thanks to the presence of interface equalities), and not "hidden" in the $T_1 \cup T_2$-lemmas as with NO. This observation is at the heart of our DTC-based interpolant combination method.

Example 6.1. Consider the following formula $\varphi$:

$\varphi \equiv (a_1 = f(a_2)) \wedge (b_1 = f(b_2)) \wedge (y - a_2 = 1) \wedge (y - b_2 = 1) \wedge (a_1 + y = 0) \wedge (b_1 + y = 1).$

$\varphi$ is expressed over the combined theory $\mathcal{EUF} \cup \mathcal{LA}(\mathbb{Q})$: the first two atoms belong to $\mathcal{EUF}$, while the last four belong to $\mathcal{LA}(\mathbb{Q})$.

Using the NO combination method, $\varphi$ can be proved unsatisfiable as follows:

(1) From the conjunction $(y - a_2 = 1) \wedge (y - b_2 = 1)$, the $\mathcal{LA}(\mathbb{Q})$-solver deduces the interface equality $(a_2 = b_2)$, which is sent to the $\mathcal{EUF}$-solver;

(2) From $(a_2 = b_2)$ and the conjunction $(a_1 = f(a_2)) \wedge (b_1 = f(b_2))$, the $\mathcal{EUF}$-solver deduces the interface equality $(a_1 = b_1)$, which is sent to the $\mathcal{LA}(\mathbb{Q})$-solver;

(3) Together with the conjunction $(a_1 + y = 0) \wedge (b_1 + y = 1)$, $(a_1 = b_1)$ causes an inconsistency in the $\mathcal{LA}(\mathbb{Q})$-solver;

(4) The $\mathcal{EUF} \cup \mathcal{LA}(\mathbb{Q})$ conflict-set generated is $\{(y - a_2 = 1), (y - b_2 = 1), (a_1 = f(a_2)), (b_1 = f(b_2)), (a_1 + y = 0), (b_1 + y = 1)\}$, corresponding to the $\mathcal{EUF} \cup \mathcal{LA}(\mathbb{Q})$-lemma $C \equiv \neg(y - a_2 = 1) \vee \neg(y - b_2 = 1) \vee \neg(a_1 = f(a_2)) \vee \neg(b_1 = f(b_2)) \vee \neg(a_1 + y = 0) \vee \neg(b_1 + y = 1)$.


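The corresponding NO proof of unsatisfiability for $\varphi$ is thus a linear resolution chain: starting from the lemma $C$, each step resolves the current clause against one of the unit clauses $(b_1 + y = 1)$, $(a_1 + y = 0)$, $(y - b_2 = 1)$, $(y - a_2 = 1)$, $(b_1 = f(b_2))$, $(a_1 = f(a_2))$, in this order, until $\bot$ is derived.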

With DTC, the Boolean search space is augmented with the set of all possible interface equalities $Eq = \{(a_1 = a_2), (a_1 = b_1), (a_1 = b_2), (a_2 = b_1), (a_2 = b_2), (b_1 = b_2)\}$, so that the DPLL engine can branch on them. If we suppose that the negative branch is explored first (and we assume for simplicity that the $T$-solvers do not perform deductions), using the DTC combination method $\varphi$ can be proved unsatisfiable as follows:

(1) Assigning $(a_2 = b_2)$ to false causes an inconsistency in the $\mathcal{LA}(\mathbb{Q})$-solver, which generates the $\mathcal{LA}(\mathbb{Q})$-lemma $C_1 \equiv \neg(y - a_2 = 1) \vee \neg(y - b_2 = 1) \vee (a_2 = b_2)$. $C_1$ is used by the DPLL engine to backjump and unit-propagate $(a_2 = b_2)$;

(2) After such propagation, assigning $(a_1 = b_1)$ to false causes an inconsistency in the $\mathcal{EUF}$-solver, which generates the $\mathcal{EUF}$-lemma $C_2 \equiv \neg(a_1 = f(a_2)) \vee \neg(b_1 = f(b_2)) \vee \neg(a_2 = b_2) \vee (a_1 = b_1)$. $C_2$ is used by the DPLL engine to backjump and unit-propagate $(a_1 = b_1)$;

(3) This propagation causes an inconsistency in the $\mathcal{LA}(\mathbb{Q})$-solver, which generates the $\mathcal{LA}(\mathbb{Q})$-lemma $C_3 \equiv \neg(a_1 + y = 0) \vee \neg(b_1 + y = 1) \vee \neg(a_1 = b_1)$;

(4) After learning $C_3$, the DPLL engine detects the unsatisfiability of $\varphi$.
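The set $Eq$ used above is just the set of unordered pairs of interface variables. The following sketch makes this explicit for the formula of this example; the two variable sets are written out by hand from the atoms listed at the start of the example, and the helper names are illustrative only.

```python
from itertools import combinations

# Variables occurring in the EUF atoms and in the LA(Q) atoms of the formula.
euf_vars = {"a1", "a2", "b1", "b2"}
la_vars = {"y", "a1", "a2", "b1", "b2"}

# Interface variables occur in atoms of both theories.
interface_vars = sorted(euf_vars & la_vars)

# One candidate interface equality per unordered pair of interface variables.
eq = list(combinations(interface_vars, 2))
print(len(eq), eq)
```

With $n$ interface variables there are $n(n-1)/2$ candidate equalities, which is why DTC is described as adding "up to" all of them to the Boolean search space.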

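The corresponding DTC proof of unsatisfiability for $\varphi$ starts by resolving $C_1$ against the unit clauses $(y - a_2 = 1)$ and $(y - b_2 = 1)$ and the result against $C_2$; it then proceeds through further resolution steps, involving among others the unit clauses $(b_1 + y = 1)$, $(a_1 + y = 0)$ and $(a_1 = f(a_2))$, until $\bot$ is derived.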

An important remark is in order. It is relatively easy to implement DTC in such a way that, if both $T_1$ and $T_2$ are convex, then all $T$-lemmas generated contain at most one positive interface equality. This is due to the fact that for convex theories $T$ it is possible to implement efficient $T$-solvers which generate conflict sets containing at most one negated equality between variables [Bozzano et al. 2005]. (E.g., this is true for all the $T_i$-solvers on convex theories implemented in MathSAT.) Thus, since we restrict to convex theories, in the rest of this paper we can assume w.l.o.g. that every $T$-lemma occurring as leaf in a resolution proof $\Pi$ of unsatisfiability deriving from DTC contains at most one positive interface equality.

Footnote (on convexity): We recall that, if $T$ is convex, then $\mu \wedge \bigwedge_i \neg l_i \models_T \bot$ iff $\mu \wedge \neg l_i \models_T \bot$ for some $i$, where the $l_i$'s are positive literals.

6.1.2 Interpolation with Nelson-Oppen. The work in [Yorsh and Musuvathi 2005] gives a method for generating an interpolant for a pair $(A, B)$ of $T_1 \cup T_2$-formulas s.t. $A \wedge B \models_{T_1 \cup T_2} \bot$ by means of the NO schema. As in [Yorsh and Musuvathi 2005], we assume that $A$ and $B$ have been purified using disjoint sets of auxiliary variables. We recall from [Yorsh and Musuvathi 2005] a couple of definitions.

Definition 6.1 (AB-mixed equality). An equality between variables $(a = b)$ is an AB-mixed equality iff $a \not\preceq B$ and $b \not\preceq A$ (or vice versa).
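Definition 6.1 translates directly into a membership test. The sketch below assumes that an equality is given as a pair of variable names and that the symbols of $A$ and of $B$ are available as plain sets; the sample partition (with $a_1, a_2$ local to $A$, $b_1, b_2$ local to $B$, and $y$ shared) is an assumption chosen for illustration.

```python
def is_ab_mixed(equality, symbols_a, symbols_b):
    """(a = b) is AB-mixed iff a does not occur in B and b does not occur in A,
    or vice versa."""
    x, y = equality
    return (x not in symbols_b and y not in symbols_a) or \
           (y not in symbols_b and x not in symbols_a)


symbols_a = {"a1", "a2", "y"}   # assumed symbols of A
symbols_b = {"b1", "b2", "y"}   # assumed symbols of B

for eq in [("a1", "b1"), ("a2", "b2"), ("a1", "a2"), ("b1", "b2"), ("a1", "y")]:
    print(eq, is_ab_mixed(eq, symbols_a, symbols_b))
```

Equalities between two $A$-local variables, two $B$-local variables, or involving a shared variable are not mixed, since at least one side of the partition contains both of their variables.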

Definition 6.2 (Equality-interpolating theory). A theory $T$ is said to be equality-interpolating iff, for all $A$ and $B$ in $T$ s.t. $A \wedge B \models_T (a = b)$ and for all AB-mixed equalities $(a = b)$, there exists a term $t$ such that $A \wedge B \models_T (a = t) \wedge (t = b)$ and $t \preceq A$ and $t \preceq B$.

The work in [Yorsh and Musuvathi 2005] describes procedures for computing the term $t$ from an AB-mixed interface equality $(a = b)$ for some convex theories of interest, including $\mathcal{EUF}$, $\mathcal{LA}(\mathbb{Q})$, and the theory of lists.

Notationally, with the letters $x, x_i, y, y_i, z$ we denote generic variables, whilst with the letters $a, a_i$ and $b, b_i$ we denote variables s.t. $a_i \not\preceq B$ and $b_i \not\preceq A$; hence, with the letters $e_i$ we denote generic AB-mixed interface equalities in the form $(a_i = b_i)$; with the letters $\eta, \eta_i$ we denote conjunctions of literals where no AB-mixed interface equality occurs, and with the letters $\mu, \mu_i$ we denote conjunctions of literals where AB-mixed interface equalities may occur. If $\mu_i$ (resp. $\eta_i$) is $\bigwedge_j l_j$, we write $\neg\mu_i$ (resp. $\neg\eta_i$) for the clause $\bigvee_j \neg l_j$.

Let $A \wedge B$ be a $T_1 \cup T_2$-inconsistent conjunction of $T_1 \cup T_2$-literals, such that $A = A_1 \wedge A_2$ and $B = B_1 \wedge B_2$ where each $A_i$ and $B_i$ is $T_i$-pure. The NO-based method of [Yorsh and Musuvathi 2005] computes an interpolant for $(A, B)$ by combining $T_i$-specific interpolants for subsets of $A$, $B$ and the set of entailed interface equalities $\{e_j\}_j$ that are exchanged between the $T_i$-solvers for deciding the unsatisfiability of $A \wedge B$. In particular, let $Eq = \{e_j\}_j$ be the set of entailed interface equalities. Due to the fact that both $T_1$ and $T_2$ are equality-interpolating, it is possible to assume w.l.o.g. that $Eq$ does not contain AB-mixed equalities, because instead of deducing an AB-mixed interface equality $(a = b)$, a $T$-solver can always deduce the two corresponding equalities $(a = t) \wedge (t = b)$. (Notice that the other $T$-solver treats the term $t$ as if it were a variable [Yorsh and Musuvathi 2005].) Let $A' = A \cup (Eq \downarrow A)$ and $B' = B \cup (Eq \downarrow B)$. Then, $T_i$-specific partial interpolants are combined according to the following inductive definition:

\[
I_{A,B}(e) \equiv
\begin{cases}
\bot & \text{if } e \in A \\
\top & \text{if } e \in B \\
\bigl(I^i_{A',B'}(e) \vee \bigvee_{e_a \in A'} I_{A,B}(e_a)\bigr) \wedge \bigwedge_{e_b \in B'} I_{A,B}(e_b) & \text{otherwise,}
\end{cases}
\qquad (11)
\]

where $e$ is either an entailed interface equality or $\bot$, and $I^i_{A',B'}(e)$ is a $T_i$-interpolant for $(A' \cup \neg e, B')$ if $e \preceq A$, and for $(A', B' \cup \neg e)$ otherwise (if $e \preceq B$). The computed interpolant for $(A, B)$ is then $I_{A,B}(\bot)$. We refer the reader to [Yorsh and Musuvathi 2005] for more details.

6.2 From DTC solving to DTC Interpolation

We now discuss how to extend the DTC method to interpolation. As with [Yorsh and Musuvathi 2005], we can handle the case that $T_1$ and $T_2$ are convex and equality-interpolating. The approach to generating interpolants for combined theories starts from the proof generated by DTC. Let $Eq$ be the set of all interface equalities occurring in a DTC refutation proof for a $T_1 \cup T_2$-unsatisfiable formula $\varphi = A \wedge B$.

In the case $Eq$ does not contain AB-mixed equalities, that is, $Eq$ can be partitioned into two sets $(Eq \setminus B) = \{(x = y) \mid (x = y) \preceq A \text{ and } (x = y) \not\preceq B\}$ and $(Eq \downarrow B) = \{(x = y) \mid (x = y) \preceq B\}$, no interpolant-combination method is needed: the combination is already encoded in the proof of unsatisfiability, and a direct application of Algorithm 1 to such proof yields an interpolant for the combined theory $T_1 \cup T_2$. Notice that this holds even though the interface equalities in $Eq$ occur neither in $A$ nor in $B$, but might be introduced in the resolution proof $\Pi$ by $T$-lemmas. In fact, as observed in [McMillan 2005], as long as for an atom $p$ either $p \preceq A$ or $p \preceq B$ holds, it is possible to consider it part of $A$ (resp. of $B$) simply by assuming the tautology clause $p \vee \neg p$ to be part of $A$ (resp. of $B$). Therefore, we can treat the interface equalities in $(Eq \setminus B)$ as if they appeared in $A$, and those in $(Eq \downarrow B)$ as if they appeared in $B$.

When $Eq$ contains AB-mixed equalities, instead, a proof-rewriting step is performed in order to obtain a proof which is free from AB-mixed equalities, and is thus amenable for interpolation as described above. The idea is similar to that used in [Yorsh and Musuvathi 2005] in the case of NO: using the fact that $T_1$ and $T_2$ are equality-interpolating, we reduce this case to the previous one by "splitting" every AB-mixed interface equality $(a_i = b_i)$ into the conjunction of two parts $(a_i = t_i) \wedge (t_i = b_i)$, such that $(a_i = t_i) \preceq A$ and $(t_i = b_i) \preceq B$. The main difference is that we do this a posteriori, after the construction of the resolution proof of unsatisfiability $\Pi$. In order to do this, we traverse $\Pi$ and split each AB-mixed equality, performing also the necessary manipulations to ensure that the result is still a resolution proof of unsatisfiability.

We describe this process in two steps. In §6.2.1 we introduce a particular kind of resolution proofs of unsatisfiability, called ie-local, and show how to eliminate AB-mixed interface equalities from ie-local proofs; in §6.2.2 we show how to implement a variant of DTC so as to generate ie-local proofs.

6.2.1 Eliminating AB-mixed equalities by exploiting ie-locality

Definition 6.3 (ie-local proof). A resolution proof of unsatisfiability $\Pi$ is local with respect to interface equalities (ie-local) iff the interface equalities occur only in subproofs $\Pi^{ie}_i$ of $\Pi$, such that within each $\Pi^{ie}_i$:

(i) all leaves are also $T$-lemma leaves of $\Pi$;

(ii) all the pivots are interface equalities;

(iii) the root contains no interface equality;

(iv) every right premise of an inner node is a leaf $T$-lemma containing exactly one positive interface equality.

Footnote (on right premises): We have adopted the graphical convention that at each resolution step in a $\Pi^{ie}_i$ subproof, if $(a_i = b_i)$ is the pivot, then the premises containing $\neg(a_i = b_i)$ and $(a_i = b_i)$ are the left and right premises respectively.

As a consequence of this definition, we also have that, within each $\Pi^{ie}_i$ in $\Pi$:

(v) all nodes are $T_1 \cup T_2$-valid; (Proof sketch: they result from Boolean resolution steps from $T_1$-valid and $T_2$-valid clauses, hence they are $T_1 \cup T_2$-valid.)

(vi) the only leaf $T$-lemma which is a left premise contains no positive interface equality. (Proof sketch: we notice that, in a resolution step deriving $C_3$ from the premises $C_1$ and $C_2$, if $C_3$ contains no positive interface equality, at least one between $C_1$ and $C_2$ contains no positive interface equality; since by (iv) the right premise contains one positive interface equality, only the left premise contains no positive interface equality. Thus the leftmost leaf $T$-lemma of $\Pi^{ie}_i$ contains no positive interface equality.)

(vii) if an interface equality $e_j$ occurs negatively in some $T$-lemma $C_i$, then $e_j$ occurs positively in a leaf $T$-lemma $C_k$ which is the right premise of a resolution step whose left premise derives from $C_i$ and other $T$-lemmas. (Proof sketch: suppose that $\neg e_j$ occurs in $C_i$ but $e_j$ does not occur in any such $C_k$. Then $e_j$ cannot be a pivot, hence $\neg e_j$ occurs in the root of $\Pi^{ie}_i$, thus violating (iii).)

Intuitively, in ie-local proofs of unsatisfiability all the reasoning on interface equalities is circumscribed within $\Pi^{ie}_i$ subproofs, which are linear sub-proofs involving only $T$-lemmas as leaves, starting from the one containing no positive interface equality, each time eliminating one negative interface equality by resolving it against the only positive one occurring in another leaf $T$-lemma.

Example 6.2. Consider the $\mathcal{EUF} \cup \mathcal{LA}(\mathbb{Q})$ formula $\varphi$ of Example 6.1, and the $T$-lemmas $C_1$, $C_2$ and $C_3$ introduced by DTC to prove its unsatisfiability. The proof $\Pi$ of Example 6.1 is not ie-local, because resolution steps involving interface equalities are interleaved with resolution steps involving other atoms. The following proof $\Pi'$, instead, is ie-local: all the interface equalities are used as pivots in the $\Pi^{ie}$ subproof.

[Proof diagram: in the $\Pi^{ie}$ subproof, $C_3$ is resolved against $C_2$ with pivot $(a_1 = b_1)$, and the result is resolved against $C_1$ with pivot $(a_2 = b_2)$; the root of $\Pi^{ie}$ is then resolved against the six unit clauses of $\varphi$ until $\bot$ is derived.]

$C_1 \equiv (a_2 = b_2) \vee \neg(y - a_2 = 1) \vee \neg(y - b_2 = 1)$

$C_2 \equiv (a_1 = b_1) \vee \neg(b_1 = f(b_2)) \vee \neg(a_1 = f(a_2)) \vee \neg(a_2 = b_2)$

$C_3 \equiv \neg(a_1 + y = 0) \vee \neg(b_1 + y = 1) \vee \neg(a_1 = b_1)$.
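The ie-local proof of Example 6.2 can be replayed with plain Boolean resolution. The sketch below assumes a minimal clause representation (frozensets of literals, atoms written as strings, a negative literal as the pair `("not", atom)`); it follows the convention that the left premise carries the negated pivot and the right premise the positive one.

```python
def neg(atom):
    return ("not", atom)


def resolve(left, right, pivot):
    """Binary resolution: `left` contains the negated pivot, `right` the positive one."""
    assert neg(pivot) in left and pivot in right
    return (left - {neg(pivot)}) | (right - {pivot})


atoms = ["a1=f(a2)", "b1=f(b2)", "y-a2=1", "y-b2=1", "a1+y=0", "b1+y=1"]
C1 = frozenset({"a2=b2", neg("y-a2=1"), neg("y-b2=1")})
C2 = frozenset({"a1=b1", neg("b1=f(b2)"), neg("a1=f(a2)"), neg("a2=b2")})
C3 = frozenset({neg("a1+y=0"), neg("b1+y=1"), neg("a1=b1")})

# The ie subproof: both pivots are interface equalities.
theta1 = resolve(C3, C2, "a1=b1")
theta2 = resolve(theta1, C1, "a2=b2")

# The rest of the proof resolves the root of the ie subproof against the unit clauses.
clause = theta2
for atom in atoms:
    clause = resolve(clause, frozenset({atom}), atom)
print(sorted(map(str, theta2)), clause)
```

The root `theta2` of the subproof contains no interface equality, as required by item (iii) of Definition 6.3, and the final clause is empty.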

If $\Pi$ is an ie-local proof containing AB-mixed interface equalities, then it is possible to eliminate all of them from $\Pi$ by applying Algorithm 2 to every $\Pi^{ie}_i$ subproof of $\Pi$. In a nutshell, each $\Pi^{ie}_i$ subproof is explored bottom-up, starting from the right premise of the root, each time expanding the rightmost side $T$-lemma in the form $C_i \equiv (a_i = b_i) \vee \neg\eta_i$ s.t. $(a_i = b_i)$ is AB-mixed into the (implicit) conjunction of two novel $T$-lemmas $C_i' \equiv (a_i = t_i) \vee \neg\eta_i$ and $C_i'' \equiv (t_i = b_i) \vee \neg\eta_i$ (step (4)), where $t_i$ is the AB-pure term computed from $C_i$ as described in §6.1.2. Then the resolution step against $C_i$ is substituted with the concatenation of two resolution steps against $C_i'$ and $C_i''$ (step (5)) and then the substitution $\neg(a_i = b_i) \mapsto \neg(a_i = t_i) \vee \neg(t_i = b_i)$ is propagated bottom-up along the left subproof $\Pi$. Notice that $C_i'$ and $C_i''$ are still $T_i$-valid because $T_i$ is equality-interpolating and $\eta_i$ does not contain other AB-mixed interface equalities.

Algorithm 2: Rewriting of $\Pi^{ie}$ subproofs

(1) Let $\sigma$ be a mapping from negative AB-mixed interface equalities to a disjunction of two negative interface equalities, such that $\sigma[\neg(a_i = b_i)] \equiv \neg(a_i = t_i) \vee \neg(t_i = b_i)$ and $t_i$ is an AB-pure term as described in §6.1.2. Initially, $\sigma$ is empty.

(2) Let $C_i \equiv (a_i = b_i) \vee \neg\mu_i$ be the right premise $T$-lemma of the root of the $\Pi^{ie}$ subproof.

(3) Replace each $\neg(a_j = b_j)$ in $C_i$ with $\sigma[\neg(a_j = b_j)]$, to obtain $C_i^* \equiv (a_i = b_i) \vee \neg\eta_i$. If $(a_i = b_i)$ is not AB-mixed, then let $\Pi$ be the subproof rooted in the left premise, and go to step (7).

(4) Split $C_i^*$ into $C_i' \equiv (a_i = t_i) \vee \neg\eta_i$ and $C_i'' \equiv (t_i = b_i) \vee \neg\eta_i$.

(5) Rewrite the subproof in which the left subproof $\Pi$, with root $\neg(a_i = b_i) \vee \neg\mu_k$, is resolved against $C_i$ into

\[
\dfrac{\dfrac{\dfrac{\Pi}{\neg(a_i = t_i) \vee \neg(t_i = b_i) \vee \neg\mu_k} \qquad C_i'}{\neg(t_i = b_i) \vee \neg\eta_k \vee \neg\eta_i} \qquad C_i''}{\neg\eta_k \vee \neg\eta_i}
\]

where $\neg\eta_k$ is obtained from $\neg\mu_k$ by substituting each negative AB-mixed interface equality $\neg(a_j = b_j)$ with $\sigma[\neg(a_j = b_j)]$.

(6) Update $\sigma$ by setting $\sigma[\neg(a_i = b_i)]$ to $\neg(a_i = t_i) \vee \neg(t_i = b_i)$.

(7) If $\Pi$ is not a leaf, that is, its root is obtained by a resolution step whose right premise is a leaf $T$-lemma $C_j$, set $C_i$ to $C_j$ and go to step (3).

(8) Otherwise, $\Pi$ is the leaf $\neg(a_i = t_i) \vee \neg(t_i = b_i) \vee \neg\mu_k$. In this case, replace each $\neg(a_j = b_j)$ in $\neg\mu_k$ with $\sigma[\neg(a_j = b_j)]$, and then exit.

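Example 6.3. Consider the formula $\varphi$ of Example 6.1 and its ie-local proof of unsatisfiability of Example 6.2. Suppose that $\varphi$ is partitioned as follows:

$A \equiv (a_1 = f(a_2)) \wedge (y - a_2 = 1) \wedge (a_1 + y = 0)$

$B \equiv (b_1 = f(b_2)) \wedge (y - b_2 = 1) \wedge (b_1 + y = 1)$

In this case, both interface equalities $(a_1 = b_1)$ and $(a_2 = b_2)$ are AB-mixed. Consider the $\Pi^{ie}$ subproof of Example 6.2, in which $C_3$ and $C_2$ are resolved into $\Theta_1$, and $\Theta_1$ and $C_1$ are resolved into $\Theta_2$:

$C_1 \equiv (a_2 = b_2) \vee \neg(y - a_2 = 1) \vee \neg(y - b_2 = 1)$
$C_2 \equiv (a_1 = b_1) \vee \neg(b_1 = f(b_2)) \vee \neg(a_1 = f(a_2)) \vee \neg(a_2 = b_2)$
$C_3 \equiv \neg(a_1 + y = 0) \vee \neg(b_1 + y = 1) \vee \neg(a_1 = b_1)$
$\Theta_1 \equiv \neg(a_1 + y = 0) \vee \neg(b_1 + y = 1) \vee \neg(b_1 = f(b_2)) \vee \neg(a_1 = f(a_2)) \vee \neg(a_2 = b_2)$
$\Theta_2 \equiv \neg(a_1 + y = 0) \vee \neg(b_1 + y = 1) \vee \neg(b_1 = f(b_2)) \vee \neg(a_1 = f(a_2)) \vee \neg(y - a_2 = 1) \vee \neg(y - b_2 = 1)$

The first $T$-lemma processed by Algorithm 2 is $C_1$. Using the technique of [Yorsh and Musuvathi 2005], $(a_2 = b_2)$ is split into $(a_2 = y - 1) \wedge (y - 1 = b_2)$ (step (4)), thus obtaining $C_1'$, $C_1''$ and the new proof (in step (5)), in which $\Theta_1$ is resolved against $C_1'$ into $\Theta_2'$, and $\Theta_2'$ against $C_1''$ into $\Theta_2$:

$C_1' \equiv (a_2 = y - 1) \vee \neg(y - a_2 = 1) \vee \neg(y - b_2 = 1)$
$C_1'' \equiv (y - 1 = b_2) \vee \neg(y - a_2 = 1) \vee \neg(y - b_2 = 1)$
$\Theta_2' \equiv \neg(y - 1 = b_2) \vee \neg(a_1 + y = 0) \vee \neg(b_1 + y = 1) \vee \neg(b_1 = f(b_2)) \vee \neg(a_1 = f(a_2)) \vee \neg(y - a_2 = 1) \vee \neg(y - b_2 = 1)$

Then, $\sigma[\neg(a_2 = b_2)]$ is set to $\neg(a_2 = y - 1) \vee \neg(y - 1 = b_2)$ (step (6)), and a new iteration of the loop (3)-(7) is performed, this time processing $C_2$. First, $\neg(a_2 = b_2)$ is replaced by $\neg(a_2 = y - 1) \vee \neg(y - 1 = b_2)$ (step (3)). Then, $(a_1 = b_1)$ can be split into $(a_1 = f(y - 1)) \wedge (f(y - 1) = b_1)$ (step (4)). After the rewriting of step (5), the proof resolves $C_3$ against $C_2'$ into $\Theta_1''$, $\Theta_1''$ against $C_2''$ into $\Theta_1'$, $\Theta_1'$ against $C_1'$ into $\Theta_2'$, and $\Theta_2'$ against $C_1''$ into $\Theta_2$, where:

$C_2' \equiv (a_1 = f(y - 1)) \vee \neg(b_1 = f(b_2)) \vee \neg(a_1 = f(a_2)) \vee \neg(a_2 = y - 1) \vee \neg(y - 1 = b_2)$
$C_2'' \equiv (f(y - 1) = b_1) \vee \neg(b_1 = f(b_2)) \vee \neg(a_1 = f(a_2)) \vee \neg(a_2 = y - 1) \vee \neg(y - 1 = b_2)$
$\Theta_1' \equiv \neg(a_1 + y = 0) \vee \neg(b_1 + y = 1) \vee \neg(b_1 = f(b_2)) \vee \neg(a_1 = f(a_2)) \vee \neg(a_2 = y - 1) \vee \neg(y - 1 = b_2)$
$\Theta_1'' \equiv \neg(f(y - 1) = b_1) \vee \neg(a_1 + y = 0) \vee \neg(b_1 + y = 1) \vee \neg(b_1 = f(b_2)) \vee \neg(a_1 = f(a_2)) \vee \neg(a_2 = y - 1) \vee \neg(y - 1 = b_2)$

Finally, $C_3$ is processed in step (8), $\neg(a_1 = b_1)$ gets replaced with $\neg(a_1 = f(y - 1)) \vee \neg(f(y - 1) = b_1)$, and the following final proof $\Pi^{ie\prime}$ is generated: $C_3'$ is resolved against $C_2'$ into $\Theta_1''$, $\Theta_1''$ against $C_2''$ into $\Theta_1'$, $\Theta_1'$ against $C_1'$ into $\Theta_2'$, and $\Theta_2'$ against $C_1''$ into $\Theta_2$, such that $C_3' \equiv C_3[\neg(a_1 = b_1) \mapsto \neg(a_1 = f(y - 1)) \vee \neg(f(y - 1) = b_1)]$.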
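The rewriting of Example 6.3 can be replayed on a linear subproof with a short sketch of Algorithm 2. Several simplifications are assumed: clauses are frozensets of signed atoms, the subproof is given as its leftmost leaf plus the list of (right premise, pivot) pairs from top to bottom, and the split terms come from a hypothetical lookup table `split` standing in for the equality-interpolation procedure of [Yorsh and Musuvathi 2005].

```python
def pos(atom):
    return (True, atom)


def neg(atom):
    return (False, atom)


def resolve(left, right, pivot):
    assert neg(pivot) in left and pos(pivot) in right
    return (left - {neg(pivot)}) | (right - {pos(pivot)})


def apply_sigma(clause, sigma):
    """Replace every negative mixed equality recorded in sigma by its two halves."""
    out = set()
    for lit in clause:
        out |= sigma.get(lit, {lit})
    return frozenset(out)


def rewrite(leaf, steps, split):
    """Sketch of Algorithm 2 on a linear subproof, processed bottom-up."""
    sigma, groups = {}, []
    for lemma, pivot in reversed(steps):           # start from the root's right premise
        star = apply_sigma(lemma, sigma)           # step (3)
        if pivot in split:                         # the pivot is AB-mixed
            a_t, t_b = split[pivot]
            rest = star - {pos(pivot)}
            groups.append([(rest | {pos(a_t)}, a_t),              # step (4): C'
                           (rest | {pos(t_b)}, t_b)])             #           C''
            sigma[neg(pivot)] = {neg(a_t), neg(t_b)}              # step (6)
        else:
            groups.append([(star, pivot)])
    new_steps = [step for group in reversed(groups) for step in group]
    return apply_sigma(leaf, sigma), new_steps     # step (8) rewrites the leftmost leaf


C1 = frozenset({pos("a2=b2"), neg("y-a2=1"), neg("y-b2=1")})
C2 = frozenset({pos("a1=b1"), neg("b1=f(b2)"), neg("a1=f(a2)"), neg("a2=b2")})
C3 = frozenset({neg("a1+y=0"), neg("b1+y=1"), neg("a1=b1")})
split = {"a2=b2": ("a2=y-1", "y-1=b2"), "a1=b1": ("a1=f(y-1)", "f(y-1)=b1")}

leaf, steps = rewrite(C3, [(C2, "a1=b1"), (C1, "a2=b2")], split)
root = leaf
for lemma, pivot in steps:                         # replay the rewritten subproof
    root = resolve(root, lemma, pivot)
print(len(steps), sorted(atom for _, atom in root))
```

Replaying the four rewritten resolution steps yields the same root $\Theta_2$ as the original subproof, and no clause of the rewritten subproof mentions $(a_1 = b_1)$ or $(a_2 = b_2)$.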



The following theorem states that Algorithm 2 is correct. 

Theorem 6.4. Let $\Pi$ be a $\Pi^{ie}$ subproof and let $\Pi'$ be the result of applying Algorithm 2 to $\Pi$. Then:

(a) $\Pi'$ does not contain any AB-mixed interface equality; and

(b) $\Pi'$ is a valid subproof with the same root as $\Pi$.

Proof. 

(a) Consider the T-lemma $C_i$ of Step (3). By item (vii) of Definition 6.3, all negative
interface equalities occurring in $C_i$ occur positively in leaf T-lemmas that are closer to
the root of $\Pi$. For the same reason, the first T-lemma $C_i$ analyzed in step (2) contains
no negative $AB$-mixed interface equalities. Therefore, it follows by induction that all
negative $AB$-mixed interface equalities in $C_i$ must have been split in Step (4) of a
previous iteration of the loop (3)-(7) of Algorithm 2, and thus they occur in $\sigma$. The same
argument can be used to show also that at steps (5) and (8) every negative $AB$-mixed
interface equality in $\neg\mu_k$ occurs in $\sigma$.
(b) We show that:

(i) every substep $\frac{\Theta' \quad \Theta''}{\Theta}$ of $\Pi'$ is a valid resolution step;

(ii) every leaf of $\Pi'$ is a T-lemma; and

(iii) the root of $\Pi'$ is the same as that of $\Pi$.

(i) The only problematic case is the resolution step

$$\frac{\neg(a_i = t_i) \vee \neg(t_i = b_i) \vee \neg\mu_k \qquad C_i'}{\neg(t_i = b_i) \vee \neg\eta_k \vee \neg\eta_i}$$

introduced in step (5) of Algorithm 2. In this case, we have to show that at the
end of the algorithm, all the negative $AB$-mixed interface equalities in $\neg\mu_k$ have
been replaced such that the result is identical to $\neg\eta_k$. We already know that all
negative $AB$-mixed equalities in $\neg\mu_k$ occur in $\sigma$, thus we only have to show that
$\sigma[\neg e_j]$ cannot change between the time when $\neg e_j$ was rewritten to obtain $\neg\eta_k$
and the time at which it is rewritten in $\neg\mu_k$. The negative equality is replaced
in $\neg\mu_k$ at the next iteration of the algorithm (in step (5) for inner nodes, and in
step (8) for the final leaf). In the meantime, the only update to $\sigma$ is performed in
step (6), but it involves the negative equality $\neg(a_i = b_i)$, which does not occur
in $\neg\mu_k$.

(ii) Let $C_i$ be a T-lemma in $\Pi$. First, we observe that if $C_i \equiv \neg(a_i = b_i) \vee \neg\mu_i$, then
for any $t_i$ also the clause $C_i^* \equiv \neg(a_i = t_i) \vee \neg(t_i = b_i) \vee \neg\mu_i$ is a T-lemma,
since $(a_i = t_i) \wedge (t_i = b_i) \models (a_i = b_i)$ by transitivity. Therefore, it follows
by induction on the number of substitutions that the clauses obtained in steps
(3) and (8) of Algorithm 2 are still T-lemmas. Finally, since we are considering
equality-interpolating theories, after step (4) of Algorithm 2 both $C_i'$ and $C_i''$ are
T-lemmas.

(iii) Since the root of $\Pi$ does not contain any interface equality (item (iii) of
Definition 6.3), in step (5) $\neg\eta_i \equiv \neg\mu_i$ and $\neg\eta_k \equiv \neg\mu_k$, and therefore the root does not
change.

□ 

Clearly, Algorithm 2 operates in linear time on the number of T-lemmas, and thus of
$AB$-mixed interface equalities. Moreover, every time an interface equality is split, only
two new nodes are added to the proof (a right leaf and an inner node), and therefore the
size of $\Pi'$ is linear in that of $\Pi$.
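
The substitution performed on a single clause in steps (3) and (8) can be sketched in a few lines of Python. This is only an illustration under assumed data structures: the clause representation (a set of `(polarity, lhs, rhs)` triples) and the function name `split_negative_equality` are hypothetical and do not come from the paper.

```python
from typing import FrozenSet, Tuple

# A literal is (is_positive, lhs, rhs); a clause is a frozenset of literals.
# This encoding is an assumption made for the sketch.
Literal = Tuple[bool, str, str]
Clause = FrozenSet[Literal]


def split_negative_equality(clause: Clause, a: str, b: str, t: str) -> Clause:
    """Replace the literal not(a = b) by not(a = t) or not(t = b).

    If `clause` is a T-lemma, the result is still a T-lemma because
    (a = t) and (t = b) entail (a = b) by transitivity.
    The clause is returned unchanged if not(a = b) does not occur in it.
    """
    mixed = (False, a, b)
    if mixed not in clause:
        return clause
    return (clause - {mixed}) | {(False, a, t), (False, t, b)}


# Example taken from the running example: not(a2 = b2) is replaced by
# not(a2 = y-1) or not(y-1 = b2); the second literal is a made-up companion.
c2 = frozenset({(False, "a2", "b2"), (False, "a1", "f(a2)")})
c2_split = split_negative_equality(c2, "a2", "b2", "y-1")
```

Each substitution removes one literal and adds two, so a clause grows by exactly one literal per split equality.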

The advantage of having ie-local proofs is that they significantly ease the process of
eliminating $AB$-mixed interface equalities. First, since all the reasoning involving
interface equalities is confined in $\Pi^{ie}$ subproofs, only such subproofs (which typically
constitute only a small fraction of the whole proof) need to be traversed and manipulated.
Second, the simple structure of $\Pi^{ie}$ subproofs allows for an efficient application of the
rewriting process of steps (5) and (3), preventing any explosion in size of the proof. In
fact, e.g., if in step (5) the right premise of the last step were instead the root of some
subproof $\Pi_i$ with $C_i$ as a leaf, then two copies $\Pi_i'$ and $\Pi_i''$ would be produced, in which
each instance of $(a_i = b_i)$ must be replaced with $(a_i = t_i)$ and $(t_i = b_i)$ respectively.

6.2.2 Generating ie-local proofs in DTC. In this section we show how to implement
a variant of DTC so as to generate ie-local proofs of unsatisfiability. For the sake of
simplicity, we first describe a simplified algorithm which makes use of two distinct DPLL
engines. We then describe how to avoid the need of a second DPLL engine with the use of
a particular search strategy for DTC.

Fig. 11. Simple strategy for generating ie-local proofs. Left: DTC search; top-right: corresponding (sub)proof;
bottom-right: $\Pi^{ie}$ (sub)proof after rewriting.

The simplified algorithm uses two distinct DPLL engines, a main one and an auxiliary
one, which we shall call DPLL-1 and DPLL-2 respectively. Consider Figure 11, left.
DPLL-1 receives in input the clauses of the input problem $\varphi$ (which we assume pure and
$T_1 \cup T_2$-inconsistent), but no interface equality, which are instead given to DPLL-2.
DPLL-1 enumerates total Boolean models $\mu$ of $\varphi$, and invokes the two $T_i$-solvers separately on
the subsets $\mu_{T_1}$ and $\mu_{T_2}$ of $\mu$. If one $T_i$-solver reports an inconsistency, then DPLL-1
backtracks. Otherwise, both $\mu_{T_i}$ are $T_i$-consistent, and DPLL-2 is invoked on the list of
unit clauses composed of the $T_1 \cup T_2$-literals in $\mu$, to check its $T_1 \cup T_2$-consistency.

DPLL-2 branches only on interface equalities, always assigning them to false first.
Some interface equalities $e_i^j$, however, may be assigned to true by unit-propagation on
previously-learned clauses in the form $C_i^j \equiv \neg\mu_i^j \vee e_i^j$, or by T-propagation on deduction
clauses $C_i^j$ in the same form; we call $C_i^j$ the antecedent clause of $e_i^j$. (As in
[Bruttomesso et al. 2008a], we assume that when a T-propagation step $\mu_i^j \models_T e_i^j$ occurs, $\mu_i^j$
being a subset of the current branch, the deduction clause $C_i^j \equiv \neg\mu_i^j \vee e_i^j$ is learned,
either temporarily or permanently; if so, we can see this step as a unit-propagation on $C_i^j$.)
When all the interface equalities have been assigned a truth value, the propositional model
$\mu' \equiv \mu_{T_1} \cup \mu_{T_2} \cup \mu_{ie}$ is checked for $T_1 \cup T_2$-consistency by invoking each of the $T_i$-solvers



Footnote: Notationally, $e_i^j$ denotes the $j$-th most-recently unit-propagated interface equality in the branch in which $C_i$ is
learned, and $C_i^j \equiv \neg\mu_i^j \vee e_i^j$ denotes the antecedent clause of $e_i^j$.

ACM Transactions on Computational Logic, Vol. V, No. N, Month 20YY. 






on $\mu_{T_i} \cup \mu_{ie}$. Since $\varphi$ is inconsistent, one of the two $T_i$-solvers detects an inconsistency
(if both do, we consider only the first). Therefore a $T_i$-lemma $C_1$ is generated. As stated
at the end of §6.1.1, we can assume w.l.o.g. that $C_1$ contains at most one positive interface
equality $e_1$. (Notice also that all negative interface equalities $\neg e_1^j$ in $C_1$, if any, have been
assigned by unit-propagation or T-propagation on some antecedent clause $C_1^j$.) DPLL-2
then learns $C_1$ and uses it as conflicting clause to backjump: starting from $C_1$, it eliminates
from the clause every $\neg e_1^j$ by resolving the current clause against its antecedent clause $C_1^j$,
until no negated equality occurs in the final clause $C_1^*$.

If $C_1$ includes one positive interface equality $e_1$, then also the final clause $C_1^*$ includes
it, so that DPLL-2 uses $C_1^*$ as a conflict clause to jump up to $\mu$ and to unit-propagate $e_1$.
Then DPLL-2 starts exploring a new branch. This process is repeated on several branches,
learning a sequence of T-lemmas $C_1, \ldots, C_k$, each $C_i$ containing only one positive interface
equality $e_i$, until a branch causes the generation of a T-lemma $C_{k+1}$ containing no positive
interface equalities. Then $C_{k+1}$ is resolved backward against the antecedent clauses of
its negative interface equalities, generating a final conflict clause $C^*$ which contains no
interface equalities.
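
The backward resolution against antecedent clauses can be illustrated with a small Python sketch. It assumes a string encoding of literals (a leading `-` marks negation), a `trail` listing propagated interface equalities in propagation order, and an `antecedent` map from each equality to its antecedent clause. These names, like `resolve` and `eliminate_negated_equalities`, are hypothetical and only serve the illustration.

```python
from typing import Dict, FrozenSet, List

Clause = FrozenSet[str]  # literals as strings; "-e" is the negation of atom "e"


def resolve(c1: Clause, c2: Clause, atom: str) -> Clause:
    """Resolvent of c1 (which contains -atom) and c2 (which contains atom)."""
    assert "-" + atom in c1 and atom in c2
    return (c1 - {"-" + atom}) | (c2 - {atom})


def eliminate_negated_equalities(
    conflict: Clause, trail: List[str], antecedent: Dict[str, Clause]
) -> Clause:
    """Resolve away every negated interface equality occurring in `conflict`.

    Equalities are processed from the most recently propagated one backwards,
    so an antecedent can only introduce negations of earlier equalities,
    which are handled later in the same pass.
    """
    current = conflict
    for eq in reversed(trail):
        if "-" + eq in current:
            current = resolve(current, antecedent[eq], eq)
    return current


# Two propagated interface equalities e1, e2; m1, m2, m3 stand for
# non-interface literals of the branch; e3 is a positive interface equality.
trail = ["e1", "e2"]
antecedent = {
    "e1": frozenset({"-m1", "e1"}),
    "e2": frozenset({"-m2", "-e1", "e2"}),
}
final_clause = eliminate_negated_equalities(
    frozenset({"-e2", "-m3", "e3"}), trail, antecedent
)
```

In this toy run the final clause keeps the positive equality `e3` and contains no negated interface equality, which is the shape of clause DPLL-2 uses to backjump and unit-propagate.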

Overall, DPLL-2 has checked the $T_1 \cup T_2$-unsatisfiability of $\mu$, building a resolution
(sub)proof $\Pi^*$ whose root is $C^*$. (Figure 11, top right.) Then the $T_1 \cup T_2$-lemma $C^*$ is
passed to DPLL-1, which uses it as a blocking clause for the assignment $\mu$, backtracks,
and continues the search. When the empty clause is obtained, it generates a proof of
unsatisfiability in the usual way (see e.g. [van Gelder 2007]).

Since the main solver knows nothing about interface equalities, they can only appear
inside the proofs of the blocking clauses generated by the auxiliary solver (like $\Pi^*$). Each
$\Pi^*$ is not yet a $\Pi^{ie}$ subproof, since it complies only with items (i), (ii) and (iii) of
Definition 6.3 but not with item (iv). The reason for the latter fact is that $\Pi^*$ contains a
set of right branches $\Pi_{C_i}$, one for each T-lemma $C_i$ in $\{C_{k+1}, \ldots, C_1\}$, representing the
resolution steps to resolve away the interface equalities introduced by
unit-propagation/T-propagation in each branch. Each such sub-branch $\Pi_{C_i}$, however, can be reduced to length
one by moving downwards the resolution steps with the antecedent clauses $C_i^1, C_i^2, \ldots$
which $C_i$ encounters in the branch. (Figure 11, bottom right.) This is done by recursively
applying the following rewriting step to $\Pi_{C_i}$, until it reduces to the single clause $C_i$.





[Rewriting rule (12): the displayed proof transformation is not recoverable from the extracted text.]

As a result, each $\Pi^*$ is transformed into a $\Pi^{ie}$ subproof, so that the final proof is ie-local.



Footnote: In fact, it is not necessary to wait for all interface equalities to have a value before invoking the $T_i$-solvers.
Rather, the standard early pruning optimization (see §2.2) can be applied.

Footnote: In order to determine the order in which to eliminate the interface equalities, the implication graph of the
auxiliary DPLL engine can be used. This is a standard process in the conflict analysis in modern SAT and SMT
solvers (see, e.g., [van Gelder 2007; Sebastiani 2007]).
In an actual implementation, there is no need to have two distinct DPLL solvers for
constructing ie-local proofs. In fact, we can obtain the same result by adopting a variant
of the DTC Strategy 1 of [Bruttomesso et al. 2008a]. We never select an interface equality
for case splitting if there is some other unassigned atom, and we always assign false to
interface equalities first. Moreover, we "delay" T-propagation of interface equalities until
all the original atoms have been assigned a truth value. Finally, when splitting on
interface equalities, we restrict both the backjumping and the learning procedures of the DPLL
engine as follows. Let $d$ be the depth in the DPLL tree at which the first interface equality
is selected for case splitting. If during the exploration of the current DPLL branch we have
to backjump above $d$, then we generate by resolution a conflict clause that does not
contain any interface equality, and "deactivate" all the T-lemmas containing some interface
equality (that is, we do not use such T-lemmas for performing unit propagation), and
we re-activate them only when we start splitting on interface equalities again. Using such
a strategy, we obtain the same effect as in the simple algorithm using two DPLL engines: the
search space is partitioned in two distinct subspaces, the one of original atoms and the one
of interface equalities, and the generated proof of unsatisfiability reflects such partition.
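
The two branching rules of this strategy (original atoms before interface equalities, and false first for interface equalities) can be written down as a decision heuristic. The sketch below is a hedged illustration only: the function `pick_decision` and its arguments are hypothetical, and the polarity chosen for original atoms is an arbitrary assumption, since the strategy does not prescribe one.

```python
from typing import Dict, Optional, Sequence, Tuple


def pick_decision(
    original_atoms: Sequence[str],
    interface_eqs: Sequence[str],
    assignment: Dict[str, bool],
) -> Optional[Tuple[str, bool]]:
    """Return the next (atom, value) to branch on, or None if all are assigned.

    Rule 1: never branch on an interface equality while some original atom
            is still unassigned.
    Rule 2: interface equalities are always tried with value False first.
    """
    for atom in original_atoms:
        if atom not in assignment:
            return (atom, True)  # polarity for original atoms: assumption
    for eq in interface_eqs:
        if eq not in assignment:
            return (eq, False)
    return None


# Hypothetical atoms: two original atoms and two interface equalities.
atoms = ["x<=1", "f(a)=b"]
eqs = ["a1=b1", "a2=b2"]
first = pick_decision(atoms, eqs, {"x<=1": True})
second = pick_decision(atoms, eqs, {"x<=1": True, "f(a)=b": False})
```

The depth at which `pick_decision` first returns an interface equality corresponds to the depth $d$ mentioned in the text; the restrictions on backjumping and learning are not modelled here.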

Finally, we remark that what is described above is only one possible strategy for generating
ie-local proofs, and not necessarily the most efficient one. Moreover, generating
ie-local proofs is only a sufficient condition to obtain interpolants from DTC while avoiding
duplications of sub-proofs, and more general strategies may be conceived. The investigation
of alternative strategies is part of ongoing and future work.

6.3 Discussion 

Our new DTC-based combination method has several advantages over the traditional one
of [Yorsh and Musuvathi 2005] based on NO:

(1) It inherits all the advantages of DTC over the traditional NO in terms of versatility,
efficiency and restrictions imposed on T-solvers [Bozzano et al. 2006; Bruttomesso
et al. 2008a]. Moreover, it allows for using a more modern SMT solver, since many
state-of-the-art solvers adopt variants or extensions of DTC instead of NO.

(2) Instead of requiring an "ad-hoc" method for performing the combination, it exploits
the Boolean interpolation algorithm. In fact, thanks to the fact that interface equalities
occur in the proof of unsatisfiability $\Pi$, once the $AB$-mixed terms in $\Pi$ are split there is
no need of any interpolant-combination method at all. In contrast, with the NO-based
method of [Yorsh and Musuvathi 2005] interpolants for $T_1 \cup T_2$-lemmas are generated
by combining "theory-specific partial interpolants" for the two $T_i$'s with an algorithm
that essentially duplicates the work that in our case is performed by the Boolean
algorithm. This allows also for potentially exploiting optimization techniques for Boolean
interpolation which are or will be made available from the literature.

(3) By splitting $AB$-mixed terms only after the construction of the proof $\Pi$, it allows
for computing several interpolants for several different partitions of the input problem
into $(A, B)$ from the same proof $\Pi$. This is particularly important for applications in
abstraction refinement [Henzinger et al. 2004]. (This feature is discussed in §6.4.)

The work of [Yorsh and Musuvathi 2005] can in principle deal with non-convex theories.
Our approach is currently limited to the case of convex theories; however, we see no reason
that would prevent it from being extended, at least theoretically, to the case of non-convex
theories. Extending the approach to non-convex theories is part of ongoing work. We also
remark that implementing the algorithm of [Yorsh and Musuvathi 2005] for non-convex
theories is a non-trivial task, and in fact we are not aware of any such implementation.

Another algorithm for computing interpolants in combined theories is given in [Sofronie- 
Stokkermans 2006]. Rather than a combination of theories with disjoint signatures, that 
work considers the interpolation problem for extensions of a base (convex) theory with new 
function symbols, and it is therefore orthogonal to ours. The solution adopted is however 
similar to what we propose, in the sense that also the algorithm of [Sofronie-Stokkermans 
2006] works by splitting AB-mixed terms. The difference is that our algorithm is tightly 
integrated in an SMT context, as it is guided by the resolution proof generated by the DPLL 
engine. 

6.4 Generating multiple interpolants 

In §2.3 we remarked that a sufficient condition for generating multiple interpolants is that
all the interpolants $I_i$'s are computed from the same proof of unsatisfiability. When
generating interpolants with our DTC-based algorithm, however, we generate a different
proof of unsatisfiability $\Pi_i$ for each partition of the input formula $\varphi$ into $A_i$ and $B_i$. In
particular, every $\Pi_i$ is obtained from the same "base" proof $\Pi$, by splitting all the
$A_iB_i$-mixed interface equalities with the algorithm described in §6.2. In this section, we show
that (2) (at §2.3) holds also when each $\Pi_i$ is obtained from the same ie-local proof $\Pi$ by
the rewriting of Algorithm 2 of §6.2.1. In order to do so, we need the following lemma.

Lemma 6.5. Let $\Theta$ be a $T_1 \cup T_2$-lemma, and let $\Pi$ be a $\Pi^{ie}$ proof for it which does not
contain any $AB$-mixed term. Then the formula $I_\Theta$ associated to $\Theta$ in Algorithm 1 is an
interpolant for $(\neg\Theta \setminus B, \neg\Theta \downarrow B)$.

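Proof. By induction on the structure of $\Pi$, we have to prove that:

(1) $\neg\Theta \setminus B \models I_\Theta$;

(2) $I_\Theta \wedge (\neg\Theta \downarrow B) \models \bot$;

(3) $I_\Theta$ contains only common symbols.

The base case is when $\Pi$ is just a single leaf. Then, the lemma trivially holds by definition
of $I_\Theta$ in this case (see Algorithm 1).

For the inductive step, let $\Theta_1 \equiv (x = y) \vee \phi_1$ and $\Theta_2 \equiv \neg(x = y) \vee \phi_2$ be the antecedents
of $\Theta$ in $\Pi$. (So $\Theta \equiv \phi_1 \vee \phi_2$.) Let $I_{\Theta_1}$ and $I_{\Theta_2}$ be the interpolants for $\Theta_1$ and $\Theta_2$ (by the
inductive hypothesis).

If $(x = y) \notin B$, then $I_\Theta \equiv I_{\Theta_1} \vee I_{\Theta_2}$.

(1) By the inductive hypothesis, $(\neg\phi_1 \wedge \neg(x = y)) \setminus B \equiv (\neg\phi_1 \setminus B) \wedge \neg(x = y) \models I_{\Theta_1}$,
and $(\neg\phi_2 \setminus B) \wedge (x = y) \models I_{\Theta_2}$. Then by resolution
$(\neg\phi_1 \wedge \neg\phi_2) \setminus B \equiv \neg\Theta \setminus B \models I_\Theta$.

(2) By the inductive hypothesis, $I_{\Theta_1} \models \phi_1 \downarrow B$ and $I_{\Theta_2} \models \phi_2 \downarrow B$, so
$I_{\Theta_1} \vee I_{\Theta_2} \models (\phi_1 \vee \phi_2) \downarrow B$, that is $I_\Theta \wedge (\neg\Theta \downarrow B) \models \bot$.

(3) By the inductive hypothesis both $I_{\Theta_1}$ and $I_{\Theta_2}$ contain only common symbols, and
so also $I_\Theta$ does.

If $(x = y) \in B$, then $I_\Theta \equiv I_{\Theta_1} \wedge I_{\Theta_2}$.

(1) By the inductive hypothesis, $\neg\phi_1 \setminus B \models I_{\Theta_1}$ and $\neg\phi_2 \setminus B \models I_{\Theta_2}$, so
$(\neg\phi_1 \wedge \neg\phi_2) \setminus B \equiv \neg\Theta \setminus B \models I_\Theta$.

(2) By the inductive hypothesis, we also have that $I_{\Theta_1} \models (\phi_1 \downarrow B) \vee (x = y)$ and
$I_{\Theta_2} \models (\phi_2 \downarrow B) \vee \neg(x = y)$. Therefore, $I_{\Theta_1} \wedge I_{\Theta_2} \models (\phi_1 \vee \phi_2) \downarrow B$, that is
$I_\Theta \wedge (\neg\Theta \downarrow B) \models \bot$.

(3) Finally, also in this case both $I_{\Theta_1}$ and $I_{\Theta_2}$ contain only common symbols, and so
also $I_\Theta$ does. □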

We now formalize the sufficient condition of [Henzinger et al. 2004] that (2) holds if
the $I_i$'s are computed from the same $\Pi$. The proof of it will be useful for showing that (2)
holds also if the $I_i$'s are computed from $\Pi_i$'s obtained from $\Pi$ by splitting the $A_iB_i$-mixed
interface equalities.

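Theorem 6.6. Let $\varphi \equiv \varphi_1 \wedge \varphi_2 \wedge \varphi_3$, and let $\Pi$ be a proof of unsatisfiability for it. Let
$A' \equiv \varphi_1$, $B' \equiv \varphi_2 \wedge \varphi_3$, $A'' \equiv \varphi_1 \wedge \varphi_2$ and $B'' \equiv \varphi_3$, and let $I'$ and $I''$ be two interpolants
for $(A', B')$ and $(A'', B'')$ respectively, both computed from $\Pi$. Then

$$I' \wedge \varphi_2 \models I''.$$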

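Proof. Let $\Pi_\Theta$ be a proof whose root is the clause $\Theta$. We will prove, by induction on
the structure of $\Pi_\Theta$, that

$$I'_\Theta \wedge \varphi_2 \models I''_\Theta \vee (\Theta \setminus \varphi_3),$$

where $I_\Theta$ is defined as in Algorithm 1. The validity of the theorem follows immediately,
by observing that the root of $\Pi$ is $\bot$.

We have to consider three cases:

(1) The first is when $\Theta$ is an input clause. Then, we have three subcases:

(a) If $\Theta \in \varphi_3$, then $I'_\Theta \equiv \top$, $I''_\Theta \equiv \top$ and $(\Theta \setminus \varphi_3) \equiv \bot$, so the theorem holds.

(b) If $\Theta \in \varphi_1$, then $I'_\Theta \equiv (\Theta \downarrow (\varphi_2 \cup \varphi_3))$ and
$I''_\Theta \vee (\Theta \setminus \varphi_3) \equiv (\Theta \downarrow \varphi_3) \vee (\Theta \setminus \varphi_3) \equiv \Theta$,
so the theorem holds also in this case.

(c) If $\Theta \in \varphi_2$, then $I'_\Theta \wedge \varphi_2 \equiv \varphi_2$ and $I''_\Theta \vee (\Theta \setminus \varphi_3) \equiv \Theta$, so again the implication
holds.

(2) The second is when $\Theta$ is a T-lemma. In this case, we have that $I'_\Theta$ is an interpolant for
$(\neg\Theta \setminus (\varphi_2 \cup \varphi_3), \neg\Theta \downarrow (\varphi_2 \cup \varphi_3))$ and $I''_\Theta$ is an interpolant for $(\neg\Theta \setminus \varphi_3, \neg\Theta \downarrow \varphi_3)$.
Therefore, by the definition of interpolant, $(\neg\Theta \setminus (\varphi_2 \cup \varphi_3)) \models I'_\Theta$ and $(\neg\Theta \setminus \varphi_3) \models I''_\Theta$.
Therefore, $I'_\Theta \vee (\Theta \setminus (\varphi_2 \cup \varphi_3))$ and $I''_\Theta \vee (\Theta \setminus \varphi_3)$ are valid clauses, and so the
implication trivially holds.

(3) The third is when $\Theta$ is obtained by resolution from $\Theta_1$ and $\Theta_2$ on a pivot $p$, with $p$
occurring in $\Theta_1$ and $\neg p$ in $\Theta_2$. If $p \in \varphi_1$
or $p \in \varphi_3$, then by the inductive hypotheses that $I'_{\Theta_i} \wedge \varphi_2 \models I''_{\Theta_i} \vee (\Theta_i \setminus \varphi_3)$, we have
that $I'_\Theta \wedge \varphi_2 \models I''_\Theta \vee (\Theta \setminus \varphi_3)$.

If $p \in \varphi_2$, then $I'_\Theta \equiv I'_{\Theta_1} \wedge I'_{\Theta_2}$ and $I''_\Theta \equiv I''_{\Theta_1} \vee I''_{\Theta_2}$. Again, by the inductive
hypotheses $I'_\Theta \wedge \varphi_2 \models I''_\Theta \vee (\Theta \setminus \varphi_3)$ holds. □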
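
The statement of Theorem 6.6 can be checked by brute force on a tiny propositional instance. The Python sketch below is an assumption-laden illustration: it uses only the labelling rules that can be read off the proofs above (a leaf in $B$ gets $\top$, a leaf in $A$ gets its restriction to atoms occurring in $B$, a resolution step gets a conjunction if the pivot occurs in $B$ and a disjunction otherwise), it ignores T-lemmas entirely, and all names (`Leaf`, `Res`, `interpolant`, `entails`) are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, Iterable, List, Set, Tuple

Lit = Tuple[str, bool]            # (atom, polarity)
Clause = FrozenSet[Lit]
Model = Dict[str, bool]
Formula = Callable[[Model], bool]


class Leaf:
    def __init__(self, clause: Iterable[Lit]):
        self.clause: Clause = frozenset(clause)


class Res:
    def __init__(self, left, right, pivot: str):
        self.left, self.right, self.pivot = left, right, pivot


def clause_fn(clause: Iterable[Lit]) -> Formula:
    lits = list(clause)
    return lambda m: any(m[a] == pol for a, pol in lits)  # empty clause is false


def interpolant(node, b_clauses: Set[Clause]) -> Formula:
    """Label the proof bottom-up; every leaf not in B is treated as part of A."""
    b_atoms = {a for c in b_clauses for a, _ in c}

    def go(n) -> Formula:
        if isinstance(n, Leaf):
            if n.clause in b_clauses:
                return lambda m: True
            return clause_fn(l for l in n.clause if l[0] in b_atoms)
        left, right = go(n.left), go(n.right)
        if n.pivot in b_atoms:
            return lambda m: left(m) and right(m)
        return lambda m: left(m) or right(m)

    return go(node)


def entails(f: Formula, g: Formula, atoms: List[str]) -> bool:
    models = (dict(zip(atoms, vals)) for vals in product([False, True], repeat=len(atoms)))
    return all((not f(m)) or g(m) for m in models)


# phi1 = p, phi2 = (not p or q), phi3 = not q; one refutation shared by both partitions.
c_p = frozenset({("p", True)})
c_pq = frozenset({("p", False), ("q", True)})
c_nq = frozenset({("q", False)})
proof = Res(Res(Leaf(c_p), Leaf(c_pq), "p"), Leaf(c_nq), "q")

i_prime = interpolant(proof, {c_pq, c_nq})   # partition (phi1, phi2 and phi3)
i_second = interpolant(proof, {c_nq})        # partition (phi1 and phi2, phi3)
lhs: Formula = lambda m: i_prime(m) and clause_fn(c_pq)(m)
holds = entails(lhs, i_second, ["p", "q"])
```

On this instance the first interpolant is equivalent to $p$ and the second to $q$, and `holds` confirms that $p \wedge (\neg p \vee q)$ entails $q$.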

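Theorem 6.7. Let $\varphi \equiv \varphi_1 \wedge \varphi_2 \wedge \varphi_3$. Let $A' \equiv \varphi_1$, $A'' \equiv \varphi_1 \wedge \varphi_2$, $B' \equiv \varphi_2 \wedge \varphi_3$, and
$B'' \equiv \varphi_3$. Let $\Pi$ be a proof of unsatisfiability for $\varphi$, and let $\Pi'$ and $\Pi''$ be obtained from $\Pi$
by splitting all the $A'B'$-mixed and $A''B''$-mixed interface equalities respectively. Let $I'$
be an interpolant for $(A', B')$ computed from $\Pi'$, and $I''$ be an interpolant for $(A'', B'')$
computed from $\Pi''$. Then

$$I' \wedge \varphi_2 \models I''.$$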
Proof. We observe that $\Pi'$ and $\Pi''$ are identical except for some $\Pi^{ie}$ subproofs that
contained some mixed interface equalities. Then, we can proceed as in Theorem 6.6; we
just need to consider one more case, namely when $\Theta$ is a $T_1 \cup T_2$-lemma at the root of a $\Pi^{ie}$
subproof. In this case, thanks to Lemma 6.5 we have the same situation as in the second
case of the proof of Theorem 6.6, and so we can apply the same argument. □

Thus, due to Theorem 6.7, we can use our DTC-based interpolation method in the
context of abstraction refinement without any modification: it is enough to remember the
original proof $\Pi$, and compute the interpolant $I_i$ from the proof $\Pi_i$ obtained by splitting
the $A_iB_i$-mixed terms in $\Pi$, for each partition of the input formula into $A_i$ and $B_i$ as in
(1).

7. EXPERIMENTAL EVALUATION 

The techniques presented in the previous sections have been implemented within MathSAT
4 [Bruttomesso et al. 2008b]. MathSAT is an SMT solver supporting a wide range of
theories and their combinations. In the last SMT solvers competition (SMT-COMP'08),
it proved to be competitive with the other state-of-the-art solvers. In this section, we
experimentally evaluate our approach.

7.1 Description of the benchmark sets

We have performed our experiments on two different sets of benchmarks. The first is ob- 
tained by running the Blast software model checker [Beyer et al. 2007] on some Windows 
device drivers; these are similar to those used in [Rybalchenko and Sofronie-Stokkermans 
2007]. This is one of the most important applications of interpolation in formal verifi- 
cation, namely abstraction refinement in the context of CEGAR. The problem represents 
an abstract counterexample trace, and consists of a conjunction of atoms. In this setting, 
the interpolant generator is called very frequently, each time with a relatively simple input 
problem. 

The second set of benchmarks originates from the SMT-LIB [Ranise and Tinelli 2006],
and is composed of a subset of the unsatisfiable problems used in recent SMT solvers
competitions (http://www.smtcomp.org). The instances have been converted to
CNF and then split in two consistent parts of approximately the same size. The set consists
of problems of varying difficulty and with a nontrivial Boolean structure.

The experiments have been performed on a 3GHz Intel Xeon machine with 4GB of RAM
running Linux. All the tools were run with a timeout of 600 seconds and a memory limit
of 900 MB. All the benchmark instances, the MathSAT executable, and the set of scripts
used to perform the experiments are available at
http://disi.unitn.it/~griggio/papers/tocl-itp.tar.bz2.

7.2 Comparison with the state-of-the-art tools available

In this section, we compare with the other interpolant generators which are available: Foci 
[McMillan 2005; Jhala and McMillan 2006], CLP-PROVER [Rybalchenko and Sofronie- 
Stokkermans 2007] and CSIsat [Beyer et al. 2008]. Other natural candidates for compar- 
ison would have been Zap [Ball et al. 2005] and Lifter [Kroening and Weissenbacher 
2007]; however, it was not possible to obtain them from the authors. We also remark that 
no comparison with INT2 [Jain et al. 2008] is possible, since the domains of applications 
of MathSAT and INT2 are disjoint: INT2 can handle $\mathcal{LA}(\mathbb{Z})$ equations/disequations and
Family       # of problems   MathSAT   Foci   CLP-PROVER   CSISAT
kbfiltr.i               64      0.16   0.36         1.47     0.17
diskperf.i             119      0.33   0.78         3.08     0.39
floppy.i               235      0.73   1.64         5.91     0.86
cdaudio.i              130      0.35   1.07         2.98     0.47

Fig. 12. Comparison of execution times of MathSAT, Foci, CLP-PROVER and CSISAT on problems generated
by Blast.




Fig. 13. Comparison of MATHSAT and FOCI on SMT-LIB instances: execution time (left), and size of the 
interpolant (right). In the left plot, points on the horizontal and vertical lines are timeouts/failures. 



[Plot: execution time, logarithmic axes from 0.01 to 1000 with 2x and 4x reference lines; horizontal axis: MathSAT.]

Fig. 14. Comparison of MathSAT and CLP-PROVER on conjunctions of $\mathcal{LA}(\mathbb{Q})$ atoms.



modular equations but only conjunctions of literals, whereas MathSAT can handle
formulas with arbitrary Boolean structure, but does not support $\mathcal{LA}(\mathbb{Z})$ except for its fragments
$\mathcal{DL}(\mathbb{Z})$ and $\mathcal{UTVPI}(\mathbb{Z})$.
[Plots: execution time; horizontal axis: MathSAT.]

Fig. 15. Comparison of MathSAT and CSISAT on SMT-LIB instances.

Fig. 16. Comparison of MathSAT and CSISAT on conjunctions of $\mathcal{LA}(\mathbb{Q})$ atoms.



The comparison had to be adapted to the limitations of Foci, CLP-PROVER and CSISAT.
In fact, the current version of Foci which is publicly available does not handle the full
$\mathcal{LA}(\mathbb{Q})$, but only the $\mathcal{DL}(\mathbb{Q})$ fragment. We also notice that the interpolants it generates
are not always $\mathcal{DL}(\mathbb{Q})$ formulas. (See, e.g., Example 4.1 of Section 4.) CLP-PROVER does
handle the full $\mathcal{LA}(\mathbb{Q})$, but it accepts only conjunctions of atoms, rather than formulas with
arbitrary Boolean structure. CSISAT, instead, can deal with $\mathcal{EUF} \cup \mathcal{LA}(\mathbb{Q})$ formulas with
arbitrary Boolean structure, but it does not support Boolean variables. These limitations
made it impossible to compare all the four tools on all the instances of our benchmark sets.
Therefore, we perform the following comparisons:

- We compare all the four solvers on the problems generated by Blast;

- We compare MathSAT with Foci on SMT-LIB instances in the theories of $\mathcal{EUF}$,
$\mathcal{DL}(\mathbb{Q})$ and their combination. In this case, we compare both the execution times
and the sizes of the generated interpolants (in terms of number of nodes in the DAG
representation of the formula). For computing interpolants in $\mathcal{EUF}$, we apply the
algorithm of [McMillan 2005], using an extension of the algorithm of [Nieuwenhuis
and Oliveras 2007] to generate $\mathcal{EUF}$ proof trees. The combination $\mathcal{EUF} \cup \mathcal{DL}(\mathbb{Q})$ is
handled with the technique described in §6;

- We compare MathSAT, CLP-PROVER and CSISAT on $\mathcal{LA}(\mathbb{Q})$ problems consisting
of conjunctions of atoms. These problems are single branches of the search trees
explored by MathSAT for some $\mathcal{LA}(\mathbb{Q})$ instances in the SMT-LIB. We have collected
several problems that took MathSAT more than 0.1 seconds to solve, and then
randomly picked 50 of them. In this case, we do not compare the sizes of the interpolants
as they are always atomic formulas;

- We compare MathSAT and CSISAT on the subset (consisting of 78 instances of the
about 400 collected) of the SMT-LIB instances without Boolean variables.



Footnote: For example, it fails to detect the $\mathcal{LA}(\mathbb{Q})$-unsatisfiability of the following problem:
$(0 \leq y - x + w) \wedge (0 \leq x - z - w) \wedge (0 \leq z - y - 1)$.
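
The unsatisfiability claimed in this footnote can be checked by hand: adding the three inequalities cancels every variable and leaves the constant $-1$, which cannot be nonnegative. The Python sketch below replays that argument with exact rationals. It assumes the variables of the problem are $x$, $y$, $z$ and $w$ as reconstructed from the text, and the helper names (`combine`, `refutes`) are hypothetical.

```python
from fractions import Fraction
from typing import Dict, List, Tuple

# A constraint (coeffs, const) stands for: 0 <= sum(coeffs[v] * v) + const.
Constraint = Tuple[Dict[str, int], int]


def combine(constraints: List[Constraint], multipliers: List[int]):
    """Nonnegative linear combination of the constraints, in exact arithmetic."""
    coeffs: Dict[str, Fraction] = {}
    const = Fraction(0)
    for (cs, k), lam in zip(constraints, multipliers):
        assert lam >= 0, "multipliers must be nonnegative"
        for var, c in cs.items():
            coeffs[var] = coeffs.get(var, Fraction(0)) + lam * c
        const += lam * k
    # Drop variables whose coefficients cancelled out.
    return {v: c for v, c in coeffs.items() if c != 0}, const


def refutes(constraints: List[Constraint], multipliers: List[int]) -> bool:
    """True if the combination reads 0 <= (negative constant), i.e. a contradiction."""
    coeffs, const = combine(constraints, multipliers)
    return not coeffs and const < 0


problem: List[Constraint] = [
    ({"y": 1, "x": -1, "w": 1}, 0),    # 0 <= y - x + w
    ({"x": 1, "z": -1, "w": -1}, 0),   # 0 <= x - z - w
    ({"z": 1, "y": -1}, -1),           # 0 <= z - y - 1
]
is_unsat = refutes(problem, [1, 1, 1])
```

The multipliers $(1, 1, 1)$ act as a certificate of unsatisfiability; the same conclusion holds if the inequalities are read as strict.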
The results are collected in Figures 12, 13, 14, 15 and 16. We can observe the following
facts: 

- Interpolation problems generated by Blast are trivial for all the tools. In fact, we
even had some difficulties in measuring the execution times reliably. Despite this,
MathSAT and CSISAT seem to be a little faster than the others.

- For problems with a nontrivial Boolean structure, MathSAT outperforms Foci in
terms of execution time. This is true even for problems in the combined theory $\mathcal{EUF} \cup
\mathcal{DL}(\mathbb{Q})$, despite the fact that the current implementation is still preliminary.

As regards CSISAT, it could solve (within the time and memory limits) only 5 of the
78 instances it could potentially handle, and in all cases MathSAT outperforms it.

- In terms of size of the generated interpolants, the gap between MathSAT and Foci
is smaller on average. However, the right plot of Figure 13 (which considers only
instances for which both tools were able to generate an interpolant) shows that there
are more cases in which MathSAT produces a smaller interpolant.

- On conjunctions of $\mathcal{LA}(\mathbb{Q})$ atoms, MathSAT outperforms CLP-PROVER, sometimes
by more than two orders of magnitude. The performance of MathSAT and CSISAT
is comparable on such instances, with MathSAT being slightly faster. However,
there are several cases in which CSISAT computes a wrong result, due to the use
of floating-point arithmetic instead of infinite-precision arithmetic (which is used by
MathSAT).
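
The last observation concerns exact versus floating-point arithmetic. The following Python sketch is a generic illustration of how the two can disagree on a linear constraint over the rationals; it does not reproduce any specific CSISAT failure, and the constraint and function names are made up for the purpose.

```python
from fractions import Fraction


def strictly_greater_float() -> bool:
    """Evaluate 1/10 + 2/10 > 3/10 with IEEE double-precision floats."""
    return 0.1 + 0.2 > 0.3


def strictly_greater_exact() -> bool:
    """Evaluate the same comparison with exact rational arithmetic."""
    return Fraction(1, 10) + Fraction(2, 10) > Fraction(3, 10)


float_answer = strictly_greater_float()   # rounding makes the sum exceed 0.3
exact_answer = strictly_greater_exact()   # the two sides are equal over Q
```

Over the rationals the two sides are equal, so the strict inequality is false, while the floating-point evaluation reports it as true; a solver relying on such a comparison could misjudge the consistency of a set of atoms.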

8. CONCLUSIONS AND FUTURE WORK 

In this paper, we have shown how to efficiently build interpolants using state-of-the-art
SMT solvers. Our methods encompass a wide range of theories (including $\mathcal{EUF}$, $\mathcal{DL}$,
$\mathcal{UTVPI}$, and $\mathcal{LA}$), and their combination (based on the Delayed Theory Combination
schema). A thorough experimental evaluation shows that the proposed methods retain the
efficiency of the solvers, and are vastly superior to the state-of-the-art interpolant generators, both in
terms of expressiveness, and in terms of efficiency.

In the future, we plan to investigate the following issues. First, we will improve the 
implementation of the interpolation method for combined theories, that is currently rather 
naive, and limited to the case of convex theories. Second, we will investigate interpola- 
tion with other rules, in particular Ackermann's expansion. Finally, we will integrate our 
interpolator within a CEGAR loop based on decision procedures, such as BLAST or the 
new version of NuSMV. In fact, such an integration raises interesting problems related to 
controlling the structure of the generated interpolants [Jhala and McMillan 2006; 2007], 
e.g. in order to limit the number or the size of constants occurring in the proof. 